Abstract: We respond to some criticism questioning the validity of the current Standard
Model Higgs exclusion limits at the Tevatron, due to the significant dependence of the
dominant production cross section from gluon-gluon fusion on the choice of parton
distribution functions (PDFs) and the strong coupling ($\alpha_S$). We demonstrate the ability of the
Tevatron jet data to discriminate between different high-x gluon distributions, performing
a detailed quantitative comparison to show that fits not explicitly including these data fail
to give a good description. In this context we emphasise the importance of the consistent
treatment of luminosity uncertainties. We comment on the values of $\alpha_S$ obtained from
fitting deep-inelastic scattering data, particularly the fixed-target NMC data, and we show
that jet data are needed for stability. We conclude that the Higgs cross-section
uncertainties due to PDFs and $\alpha_S$ currently used by the Tevatron and LHC experiments are not
significantly underestimated, contrary to some recent claims.

1 Introduction

Discovery or exclusion of the Standard Model Higgs boson (H) at the Tevatron and Large
Hadron Collider (LHC) requires precise knowledge of the theoretical cross section; see, for
example, refs. [1-3], and references therein. Cross-section predictions for the dominant
production channel of gluon-gluon fusion ($gg \to H$) are strongly dependent on both the gluon
distribution in the proton and the strong coupling $\alpha_S$, which enters squared at leading-order
(LO) with sizeable next-to-leading order (NLO) and next-to-next-to-leading order (NNLO)
corrections. In particular, the Tevatron Higgs analysis [4, 5], with current exclusion at 95%
confidence-level (C.L.) for a Standard Model Higgs boson mass $M_H \in [158, 173]$ GeV [5],
requires knowledge of the gluon distribution at relatively large momentum fractions x > 0.1
where constraints from data on deep-inelastic scattering (DIS) or Drell-Yan production are
fairly weak. In this paper, which accompanies a separate paper [6], we respond to several
(related) issues which have been raised in recent months [7-12], particularly regarding the
use of parton distribution functions (PDFs) determined from limited data sets in making
predictions for the Tevatron (and LHC) Higgs cross sections, as alternatives to the most
common choice of the MSTW 2008 PDFs [13] used in the Tevatron [4, 5] and LHC [3]
Higgs analyses.

First, in section 2 we demonstrate explicitly how the $gg \to H$ cross sections depend
on the Standard Model Higgs boson mass $M_H$, the gluon-gluon luminosity function and
the choice of $\alpha_S(M_Z^2)$, by comparing predictions obtained using PDFs (and $\alpha_S$ values)
from various different PDF fitting groups. In section 3 we present a detailed quantitative
comparison of the quality of the description of Tevatron jet data using different PDF
sets. The MSTW 2008 analysis [13] is the only current NNLO PDF fit which includes the
Tevatron jet data, providing the only direct constraint on the high-x gluon distribution. In
section 4 we examine the different values of the strong coupling $\alpha_S$ used by the different
PDF groups, particularly those values mainly extracted from DIS data, and we look at the
constraints arising from different sources. In section 5 we respond to recent claims [11] that
the theoretical treatment of the longitudinal structure function $F_L$ for the NMC data [14]
can explain the bulk of the difference between predictions for Higgs cross sections calculated
using either the MSTW08 [13] or ABKM09 [15] PDFs. Finally we conclude in section 6 that
MSTW08 is presently the only fully reliable PDF set for calculating Higgs cross sections
at NNLO, particularly if sensitive to the high-x gluon distribution, and that the recent
exclusion bounds [4, 5] obtained by the Tevatron experiments are robust based upon this
choice.

2 Dependence of Higgs cross sections on PDFs and $\alpha_S$
2.1 Dependence on Higgs mass 

We show the NLO and NNLO $gg \to H$ total cross sections ($\sigma_H$) versus the Standard Model
Higgs boson mass $M_H$ in figure 1 at the Tevatron (centre-of-mass energy, $\sqrt{s} = 1.96$ TeV)
and the LHC ($\sqrt{s} = 7$ TeV) for different PDF sets and a fixed scale choice of $\mu_R = \mu_F = M_H$,
calculated with settings given in section 4.2 of ref. [6]. At NLO [16], we use the
corresponding NLO PDFs (and $\alpha_S$ values) from MSTW08 [13], CTEQ6.6 [17], CT10 [18]
and NNPDF2.1 [19], all of which are fully global fits to HERA and fixed-target DIS data,
fixed-target Drell-Yan production, and Tevatron data on vector boson and jet production.
At NNLO [20], we use the corresponding NNLO PDFs (and $\alpha_S$ values) from MSTW08 [13],
ABKM09 [15], JR09 [21, 22] and HERAPDF1.0 [23], where in the last case no uncertainty
PDF sets are provided and the two curves correspond to $\alpha_S(M_Z^2) = 0.1145$ and
$\alpha_S(M_Z^2) = 0.1176$, with the larger $\alpha_S$ value giving the larger Higgs cross section. For the other PDF
sets, we compute the "PDF+$\alpha_S$" uncertainty at 68% C.L. according to the recommended
prescription of each group, summarised in ref. [6]. The data sets included in the MSTW08
fit at NNLO are the same as at NLO, with the omission of HERA data on jet production,
while the ABKM09 and JR09 fits only include DIS and fixed-target Drell-Yan data. The
HERAPDF1.0 fit only includes combined HERA I inclusive DIS data, while the other
NNLO fits (MSTW08, ABKM09, JR09) instead include the older separate data from H1
and ZEUS. However, including the combined HERA I data [23] in a variant of the MSTW08
Figure 1. $\sigma_H$ vs. $M_H$ with PDF+$\alpha_S$ uncertainties at 68% C.L. for $gg \to H$ calculated at (a) NLO
at the Tevatron, (b) NNLO at the Tevatron, (c) NLO at the LHC, and (d) NNLO at the LHC.



fit was found to have little effect on predictions for Higgs cross sections [24]. The NNPDF
fits parameterise the starting distributions at $Q_0^2 = 2$ GeV$^2$ as neural networks, whereas
other groups all use the more traditional approach of parameterising the input PDFs as
some functional form in x, each with a number of free parameters, which varies significantly
between groups. Contrary to the "standard" input parameterisation at $Q_0^2 \ge 1$ GeV$^2$, the
JR09 set uses a "dynamical" parameterisation of valence-like input distributions at an
optimally chosen $Q_0^2 < 1$ GeV$^2$, which gives a slightly worse fit quality and lower $\alpha_S$ values
than the corresponding "standard" parameterisation, but is nevertheless favoured by the
JR09 authors. More details on differences between PDF sets are given in section 2 of
ref. [6]; see also the descriptions in refs. [25-27].

The size of the higher-order corrections to the $gg \to H$ total cross sections is
substantial. Taking the appropriate MSTW08 PDFs and $\alpha_S$ values consistently at each
perturbative order for $\sigma_H$ with $M_H = 160$ GeV, the NLO/LO ratio is 2.1 (Tevatron) or 1.9
(LHC), the NNLO/LO ratio is 2.7 (Tevatron) or 2.4 (LHC), and so the NNLO/NLO ratio is
1.3 (Tevatron and LHC). The perturbative series is therefore slowly convergent, mandating
the use of (at least) NNLO calculations together with the corresponding NNLO PDFs and
$\alpha_S$ values. The convergence can be improved by using a scale choice $\mu_R = \mu_F = M_H/2$,
which mimics the effect of soft-gluon resummation. However, the goal of this paper is to
Figure 2. Ratio to MSTW08 $gg \to H$ cross section at Tevatron with PDF+$\alpha_S$ uncertainties for
(a) NLO at 68% C.L., (b) NLO at 90% C.L., (c) NNLO at 68% C.L., (d) NNLO at 90% C.L.



study only the PDF and $\alpha_S$ dependence of the $gg \to H$ cross sections, and we do not
aim to come up with a single "best" prediction together with a complete evaluation of all
sources of theoretical uncertainty. We do not consider, for example, optimal (factorisation
and renormalisation) scale choices and variations, electroweak corrections, the effect of
threshold resummation, $(C_A \pi \alpha_S)^n$-enhanced terms, use of a finite top-quark mass in the
calculation of higher-order corrections, bottom-quark loop contributions, etc. The PDF
and $\alpha_S$ dependence roughly decouples from these other, more refined, aspects of the
calculation, and therefore the findings regarding PDFs and $\alpha_S$ reported here will be relevant
also for more complete calculations found, for example, in refs. [1, 2] or the recent Handbook
of LHC Higgs Cross Sections [3].
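
As a quick arithmetic cross-check of the higher-order ratios quoted earlier in this subsection for $M_H = 160$ GeV, the NNLO/NLO ratio follows from dividing the two ratios taken with respect to LO. The short Python sketch below only reuses the rounded numbers from the text; the dictionary layout and function name are our own illustrative choices.

```python
# Ratios to LO quoted in the text for M_H = 160 GeV (MSTW08 at each order).
k_factors = {
    "Tevatron": {"NLO/LO": 2.1, "NNLO/LO": 2.7},
    "LHC": {"NLO/LO": 1.9, "NNLO/LO": 2.4},
}


def nnlo_over_nlo(ratios):
    """NNLO/NLO is the quotient of the two ratios taken with respect to LO."""
    return ratios["NNLO/LO"] / ratios["NLO/LO"]


for collider, ratios in k_factors.items():
    print(f"{collider}: NNLO/NLO = {nnlo_over_nlo(ratios):.2f}")
```

Both quotients round to 1.3, so the NNLO correction remains a roughly 30% effect on top of NLO at either collider.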

The ratios of the cross sections with respect to the MSTW08 predictions are shown
for the Tevatron in figure 2 and for the LHC in figure 3, where PDF+$\alpha_S$ uncertainty
bands at both 68% and 90% C.L. are plotted. It can be seen that there is generally good
agreement between the global fits at NLO. However, at NNLO, the ABKM09 prediction,
and the HERAPDF1.0 prediction with the lower $\alpha_S$ value, are well below MSTW08 at
the Tevatron, even allowing for the 90% C.L. PDF+$\alpha_S$ uncertainties, with a significant
discrepancy also at the LHC.

Baglio, Djouadi, Ferrag and Godbole (BDFG) [9] have claimed that some publicly 
Figure 3. Ratio to MSTW08 $gg \to H$ cross section at 7 TeV LHC with PDF+$\alpha_S$ uncertainties
for (a) NLO at 68% C.L., (b) NLO at 90% C.L., (c) NNLO at 68% C.L., (d) NNLO at 90% C.L.



available PDFs, specifically the HERAPDF1.0 NNLO set with $\alpha_S(M_Z^2) = 0.1145$, can lower
the Tevatron Higgs cross section by up to 40% compared to MSTW08 for $M_H \sim 160$ GeV,
requiring more than twice as much Tevatron data to recover the same sensitivity as the 2010
analysis by the Tevatron experiments [4], which used MSTW08 for the central prediction.
This is obviously potentially very worrying. However, figure 2(c,d) shows that the lowest
cross section occurs not with either of the HERAPDF sets, but with ABKM09, where the
central cross section is about 75% that of MSTW08 at $M_H \sim 160$ GeV. The cross-section ratios
for ABKM09 and JR09 in figure 2(d) seem close to those in the inset of figure 1 of ref. [9],
but we do not reproduce the extreme behaviour of the HERAPDF1.0 sets. Our results
are supported by those in ref. [10] where it is also observed that ABKM09 gives lower
Higgs cross sections at the Tevatron than the HERAPDF1.0 set with $\alpha_S(M_Z^2) = 0.1145$.
One obvious difference is the scale choice $\mu_R = \mu_F = M_H/2$ used in ref. [9] rather than
$\mu_R = \mu_F = M_H$ used here and in ref. [10]. However, we have checked that the ratio of cross
sections with respect to MSTW08 is largely independent of the different scale choice. The
detailed arguments of ref. [9] assume the "worst-case scenario" of a 40% reduction in $\sigma_H$
at $M_H \sim 160$ GeV from the central value of HERAPDF1.0 with $\alpha_S(M_Z^2) = 0.1145$, and
therefore the conclusions require modification if there is a mistake in their HERAPDF1.0
calculations.$^1$ Nevertheless, even the 25% reduction in $\sigma_H$ at $M_H \sim 160$ GeV from the
central value of ABKM09 is still a problem, as it lies well outside both the MSTW08
PDF+$\alpha_S$ uncertainty at 90% C.L. used in ref. [4] and the PDF4LHC$^2$ uncertainty used in
ref. [5]. (These two prescriptions for uncertainties give similar results, but the former is
clearly much simpler; see section 5 of ref. [6] for more discussion.) We note that in justifying
the use of the HERAPDF set, BDFG [9] make the statement: "However, HERAPDF
describes well not only the Tevatron jet data but also the W, Z data. Since this is a
prediction beyond leading order, it has also the contributions of the gluon included. This
gives an indirect test that the gluon densities are predicted in a satisfactory way." This
statement is very misleading: the W charge asymmetry and the Z rapidity distribution at
the Tevatron, used as a PDF constraint, are almost insensitive to the gluon distribution,
and the statement makes no reference to the quantitative comparison of PDFs to jet data.
In the rest of this paper we will present a number of arguments to show that, of all the
currently available NNLO PDF sets, only MSTW08 provides a fully reliable estimate of
the Higgs cross sections at the Tevatron and LHC.

2.2 Dependence on gg luminosity 

At LO, the PDF dependence of the $gg \to H$ total cross section is simply given by the
gluon-gluon luminosity evaluated at a partonic centre-of-mass energy $\sqrt{\hat{s}} = M_H$,

$$\frac{\partial \mathcal{L}_{gg}}{\partial \hat{s}} = \frac{1}{s} \int_\tau^1 \frac{dx}{x}\, f_g(x, \hat{s})\, f_g(\tau/x, \hat{s}), \tag{2.1}$$

where $f_g(x, \mu^2 = \hat{s})$ is the gluon distribution and $\tau = \hat{s}/s$. In figure 4 we show the
gluon-gluon luminosities calculated using different PDF sets and taken as the ratio with respect
to the MSTW 2008 value, at centre-of-mass energies corresponding to the (a,b) Tevatron
and (c,d) LHC. The relevant values of $\sqrt{\hat{s}} = M_H = \{120, 180, 240\}$ GeV are indicated,
along with the threshold for $t\bar{t}$ production at the LHC, $\sqrt{\hat{s}} = 2m_t$ with $m_t = 171.3$ GeV,
where this process is predominantly gg-initiated at the LHC. Indeed, $t\bar{t}$ production at the
LHC is strongly correlated with $gg \to H$ production at the Tevatron, with both processes
probing the gluon distribution at similar x values, as seen from figure 4. We point out in
ref. [6] that the current $t\bar{t}$ cross-section measurements at the LHC [29, 30] seem to distinctly
favour MSTW08 over ABKM09.
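
To make the luminosity integral of eq. (2.1) concrete, the following Python sketch evaluates it numerically for a user-supplied gluon distribution. It is only an illustration under stated assumptions: `toy_gluon` is a hypothetical shape (not a fitted PDF from any group), the scale dependence of the gluon is ignored, and the helper `typical_x` simply returns $\sqrt{\tau}$ as a rough indication of the momentum fractions probed.

```python
import math


def gg_luminosity(f_g, s_hat, s, n_steps=2000):
    """Return (1/s) * int_tau^1 dx/x f_g(x) f_g(tau/x), with tau = s_hat/s.

    Trapezoidal rule in u = ln(x), so that dx/x = du. The factorisation-scale
    dependence of f_g is ignored in this sketch.
    """
    tau = s_hat / s
    u_min = math.log(tau)            # u runs from ln(tau) up to 0
    h = -u_min / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        x = min(1.0, math.exp(u_min + i * h))
        y = min(1.0, tau / x)        # momentum fraction of the second gluon
        weight = 0.5 if i in (0, n_steps) else 1.0
        total += weight * f_g(x) * f_g(y)
    return total * h / s


def toy_gluon(x):
    """Hypothetical gluon shape, x*f_g = x^-0.2 (1-x)^5; not a fitted PDF."""
    return x ** -1.2 * (1.0 - x) ** 5


def typical_x(mass, sqrt_s):
    """sqrt(tau), a rough measure of the momentum fraction probed."""
    return mass / sqrt_s


SQRT_S_TEVATRON = 1960.0  # GeV
SQRT_S_LHC = 7000.0       # GeV

for m_h in (120.0, 180.0, 240.0):
    lum = gg_luminosity(toy_gluon, m_h ** 2, SQRT_S_TEVATRON ** 2)
    x_typ = typical_x(m_h, SQRT_S_TEVATRON)
    print(f"Tevatron, M_H = {m_h:.0f} GeV: x ~ {x_typ:.3f}, dL/ds_hat = {lum:.3e} GeV^-2")

print(f"LHC t-tbar threshold: x ~ {typical_x(2 * 171.3, SQRT_S_LHC):.3f}")
```

In a real comparison one would replace `toy_gluon` by gluon distributions evaluated at $\mu^2 = \hat{s}$ from each PDF set and take the ratio to the MSTW 2008 result, as done in figure 4.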

The NLO luminosities in figure 4(a,c) are shown for the global fits from MSTW08 [13],
CTEQ6.6 [17], CT10 [18] and NNPDF2.1 [19]. The NNLO luminosities in figure 4(b,d)
are shown for MSTW08 [13], HERAPDF1.0 [23], ABKM09 [15] and JR09 [21, 22]. The
two HERAPDF1.0 NNLO curves shown are for both $\alpha_S(M_Z^2) = 0.1145$ and 0.1176, where
the latter gives the smaller gg luminosity at low $\hat{s}$ values and the larger gg luminosity at
high $\hat{s}$ values. The larger $\alpha_S$ value means that less gluon is required at low x to fit the
$^1$We thank J. Baglio for confirming that the HERAPDF1.0 curves in figure 1 of ref. [9] were erroneously
drawn with $\mu_R = \mu_F = (3/2) M_H$, to be corrected in an erratum included in v3 of the preprint version [9].

$^2$The PDF4LHC recommendation [28] is to rescale the MSTW08 NNLO PDF+$\alpha_S$ uncertainty at 68%
C.L. by the ratio of the envelope of the MSTW08 NLO, CTEQ6.6 NLO and NNPDF2.0 NLO predictions,
all including PDF+$\alpha_S$ uncertainties at 68% C.L., to the MSTW08 NLO PDF+$\alpha_S$ 68% C.L. uncertainty.
Figure 4. Gluon-gluon luminosities as the ratio with respect to MSTW 2008 for (a) NLO at the
Tevatron, (b) NNLO at the Tevatron, (c) NLO at the LHC, and (d) NNLO at the LHC.



scaling violations of HERA data, $\partial F_2/\partial \ln(Q^2) \sim \alpha_S\, g$, therefore more gluon is required
at high x from the momentum sum rule. Both these effects, larger $\alpha_S$ and more high-x
gluon, raise the Tevatron Higgs cross section and improve the quality of the description
of Tevatron jet data, as we will see in section 3. The NNLO trend between groups is
similar to that at NLO [6]. There is reasonable agreement for the global fits, but more variation
for the other sets, particularly at large $\hat{s}$, where HERAPDF1.0 and ABKM09 have much
softer high-x gluon distributions, and this feature has a direct impact on the $gg \to H$ cross
sections, particularly at the Tevatron (see figure 2).

2.3 Dependence on strong coupling $\alpha_S$

The various PDF fitting groups take different approaches to the values of the strong
coupling $\alpha_S$ and, for consistency, the same value as used in the fit should be used in subsequent
cross-section calculations. The values of $\alpha_S(M_Z^2)$, and the corresponding uncertainties, for
MSTW08, ABKM09 and GJR08/JR09 are obtained from a simultaneous fit with the PDF
parameters. Other groups choose a fixed value, generally close to the world average [31],
and for those groups we assume a 1-$\sigma$ uncertainty of ±0.0012 [26], very similar to the
MSTW08 uncertainty. The central values and 1-$\sigma$ uncertainties are depicted in figure 5 as
the larger symbols and error bars, while the smaller symbols indicate the PDF sets with
at (a) NLO and (b) NNLO. The smaller symbols indicate the PDF sets with alternative values of
$\alpha_S(M_Z^2)$ provided by each fitting group. The shaded band indicates the world average $\alpha_S(M_Z^2)$ [31].



alternative values of $\alpha_S(M_Z^2)$ provided by each fitting group. The fitted NLO $\alpha_S(M_Z^2)$
value is always larger than the corresponding NNLO $\alpha_S(M_Z^2)$ value in an attempt by the
fit to mimic the missing higher-order corrections, which are generally positive. The world
average $\alpha_S(M_Z^2)$ [31], shown in figure 5, combines determinations made at a variety of
perturbative orders, but in most cases an increase in the order corresponds to a decrease
in the value of $\alpha_S(M_Z^2)$ obtained.

The $gg \to H$ cross sections at the Tevatron and LHC start at $\mathcal{O}(\alpha_S^2)$ at LO, with
anomalously large higher-order corrections, therefore they are directly sensitive to the value
of $\alpha_S(M_Z^2)$. Moreover, there is a known correlation between the value of $\alpha_S$ and the gluon
distribution, which additionally affects the $gg \to H$ cross sections. In figures 6 and 7 we
show this sensitivity by plotting the Higgs cross sections versus $\alpha_S(M_Z^2)$ at the Tevatron
and LHC for Higgs masses $M_H = \{120, 180, 240\}$ GeV. We plot both NLO and NNLO
predictions for a fixed scale choice $\mu_R = \mu_F = M_H$. The format of the plots is that the
markers are centred on the default $\alpha_S(M_Z^2)$ value and the corresponding predicted
cross section of each group. The horizontal error bars span the $\alpha_S(M_Z^2)$ uncertainty, the inner
vertical error bars span the "PDF only" uncertainty where possible (i.e. not for ABKM09 or
GJR08/JR09, where $\alpha_S$ is mixed with the input PDF parameters in the error matrix), and
the outer vertical error bars span the PDF+$\alpha_S$ uncertainty. The effect of the additional $\alpha_S$
uncertainty is sizeable. The dashed lines at NLO or the solid lines at NNLO interpolate the
cross-section predictions calculated with the alternative PDF sets provided by each group,
represented by the smaller symbols in figure 5. The NNLO plots in figure 7 also show
the NLO predictions (open symbols and dashed lines) together with the corresponding
NNLO predictions (closed symbols and solid lines) to explicitly demonstrate how the size
of the NNLO corrections depends on both the $\alpha_S(M_Z^2)$ choice and the PDF choice. It
is apparent from the plots that at least part of the MSTW08/ABKM09 discrepancy for
Higgs cross sections is due to using quite different values of $\alpha_S(M_Z^2)$ at NNLO, specifically
$\alpha_S(M_Z^2) = 0.1135 \pm 0.0014$ for ABKM09 [15] compared to $\alpha_S(M_Z^2) = 0.1171 \pm 0.0014$
Figure 6. $gg \to H$ total cross sections, plotted as a function of $\alpha_S(M_Z^2)$, at NLO.

for MSTW08 [13, 32]. Comparing cross-section predictions at the same value of $\alpha_S(M_Z^2)$
would reduce the MSTW08/ABKM09 discrepancy at the LHC, but there would still be a
significant discrepancy at the Tevatron (see also the later table 5 in section 5).
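
As a rough guide to how much of the difference could come from the coupling alone, one can rescale by the LO power $\alpha_S^2$ using the two NNLO values quoted above. The Python sketch below is a deliberately crude estimate under explicit assumptions: it uses the values at $M_Z^2$ rather than at the scale $\mu_R = M_H$, it ignores the higher-order corrections (which carry additional powers of $\alpha_S$), and it ignores the correlation between $\alpha_S$ and the gluon distribution. The function name is our own.

```python
ALPHA_S_MSTW08 = 0.1171  # NNLO alpha_S(M_Z^2) quoted for MSTW08
ALPHA_S_ABKM09 = 0.1135  # NNLO alpha_S(M_Z^2) quoted for ABKM09


def lo_rescaling(alpha_s_new, alpha_s_ref, power=2):
    """Ratio of coupling factors for a cross section starting at alpha_S**power."""
    return (alpha_s_new / alpha_s_ref) ** power


ratio = lo_rescaling(ALPHA_S_ABKM09, ALPHA_S_MSTW08)
print(f"LO alpha_S^2 rescaling, ABKM09/MSTW08: {ratio:.3f}")
```

Under these assumptions the coupling alone accounts for a reduction of roughly 6%, well short of the 25% difference seen at the Tevatron for $M_H \sim 160$ GeV, in line with the statement that a significant discrepancy would remain there.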

2.4 Theoretical uncertainties on $\alpha_S$

In ref. [32] we gave a prescription for calculating the "PDF+$\alpha_S$" uncertainty on an
observable such as a hadronic cross section, due to only experimental errors on the data
fitted. An estimate of the theoretical uncertainty on $\alpha_S$ was given as ±0.003 at NLO
Figure 7. $gg \to H$ total cross sections, plotted as a function of $\alpha_S(M_Z^2)$, at NNLO.



and at most ±0.002 at NNLO, where these values should be interpreted as roughly 1-$\sigma$
(68% C.L.). However, this additional uncertainty was not recommended to be propagated
to the "PDF+$\alpha_S$" uncertainty on cross sections, in the same way that theoretical errors
on PDFs are not generally provided and propagated to uncertainties on cross sections. It
was intended simply to be an estimate of how much the value of $\alpha_S(M_Z^2)$ might change
if extracted at even higher orders. It has subsequently been proposed (by Baglio and
Djouadi) to include the theoretical uncertainty on $\alpha_S$ in the cross-section calculation for
the $gg \to H$ process at the Tevatron [7] and LHC [8], which somewhat reduces the apparent
inconsistency between MSTW08 and ABKM09 seen in figures 2 and 3. In figure 8
we show the effect of adding in quadrature an additional theoretical uncertainty on $\alpha_S$
to the 90% C.L. MSTW08 PDF+$\alpha_S$ uncertainty for the $gg \to H$ cross sections at both
the Tevatron and LHC, at both NLO and NNLO, plotted as a function of the Higgs mass
$M_H$.$^3$ If a similar theoretical uncertainty on $\alpha_S$ were also added to the ABKM09 uncertainty
band, which includes only experimental uncertainties, then the MSTW08 and ABKM09
uncertainty bands would overlap at the Tevatron, at least in the $M_H$ range shown here.
However, even if the additional $\alpha_S$ uncertainty is applied in this manner, it is misleading to
claim that it leads to more of an agreement in the predictions obtained using the two PDF
sets, since variations of cross sections with $\alpha_S$ are very highly correlated between different
PDF sets. We will see in the rest of this paper that differences between groups in $\alpha_S$
values, gluon distributions and Higgs cross sections are largely due to the selection of data
fitted, and it is not the case that the discrepancies should be attributed to unaccounted
theoretical uncertainties.

Figure 8. Effect of including an additional theoretical uncertainty on $\alpha_S$ on the 90% C.L. PDF+$\alpha_S$
uncertainty for $gg \to H$ at (a) NLO at the Tevatron, (b) NNLO at the Tevatron, (c) NLO at the
LHC, and (d) NNLO at the LHC.

$^3$We calculate the cross sections evaluated with $\alpha_S(M_Z^2) = 0.120 \pm 0.003$ at NLO and
$\alpha_S(M_Z^2) = 0.117 \pm 0.002$ at NNLO, to determine the variation due to the additional theoretical uncertainty on $\alpha_S$ at
68% C.L., then we scale this uncertainty by 1.64485 to get the 90% C.L. theoretical uncertainty.
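
The factor 1.64485 used in this footnote is the two-sided 90% quantile of a unit Gaussian, taking the 68% C.L. variation as 1-$\sigma$. The following Python sketch reproduces that factor and shows the quadrature sum used for figure 8. The two relative uncertainties in the example are hypothetical placeholders chosen only to illustrate the arithmetic; they are not values from this paper.

```python
import math
from statistics import NormalDist


def gaussian_scale_to_cl(confidence_level):
    """Number of standard deviations spanning a two-sided Gaussian interval."""
    return NormalDist().inv_cdf(0.5 + confidence_level / 2.0)


def add_in_quadrature(*uncertainties):
    """Combine independent uncertainties as the root of the sum of squares."""
    return math.sqrt(sum(u ** 2 for u in uncertainties))


scale_90 = gaussian_scale_to_cl(0.90)
print(f"1-sigma -> 90% C.L. scale factor: {scale_90:.5f}")

# Hypothetical inputs, for illustration only (relative uncertainties).
pdf_alphas_90 = 0.10            # assumed 90% C.L. PDF+alpha_S uncertainty
theory_alphas_68 = 0.04         # assumed 1-sigma variation from theory alpha_S
theory_alphas_90 = scale_90 * theory_alphas_68

total_90 = add_in_quadrature(pdf_alphas_90, theory_alphas_90)
print(f"combined 90% C.L. uncertainty: {total_90:.3f}")
```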
3 Constraints from jet production at the Tevatron

Here we present a quantitative study of the description of the Tevatron Run II inclusive
jet data [33-35] and dijet data [36] by different PDF sets. The goal is to compare the
description of Tevatron jet data in a similar manner to the benchmark cross-section study
of ref. [6], i.e. we use the same code and settings for all NLO and NNLO PDF sets (with
the correct $\alpha_S$ value for each set) to ensure that observed differences are only due to the
PDF choice rather than any other factor. We do not consider the less reliable Tevatron
Run I data, which prefer a much harder high-x gluon distribution [13], and are obtained
using less sophisticated jet algorithms. The three data sets on inclusive jet production
from the Tevatron Run II [33-35] were all found to be compatible [13]. The MSTW 2008
analysis [13] included the CDF Run II inclusive jet data using the $k_T$ jet algorithm [33] and
the D0 Run II inclusive jet data using a cone jet algorithm [35]. Consistency was checked
with the CDF Run II inclusive jet data using the cone-based Midpoint jet algorithm [34],
but this data set was not included in the final MSTW08 fit, since it is essentially the same
measurement (using 1.13 fb$^{-1}$) as ref. [33] (using 1.0 fb$^{-1}$), differing mainly by the choice
of jet algorithm. The $k_T$ jet algorithm is theoretically preferred due to its property of
infrared safety, and the corresponding CDF Run II data [33] was already published and
implemented in the MSTW08 analysis by the time the CDF Run II Midpoint data [34]
appeared. The D0 Run II inclusive jet data [35] and dijet data [36], both defined using a
cone jet algorithm, are also measured from essentially the same 0.7 fb$^{-1}$ of data, differing
mainly by the kinematic binning, so as with the two CDF data sets it would be
double-counting to include both in the same PDF extraction. We will concentrate on the inclusive
jet data (section 3.2), but we will also make a first quantitative comparison to the more
recent D0 dijet data (section 3.3). However, first in section 3.1 we precisely define the
goodness-of-fit measure used for the comparison of data and theory.

One obvious problem is that the complete NNLO partonic cross section ($\hat{\sigma}$) for
inclusive jet production is currently unknown, and needs to be approximated with the NLO $\hat{\sigma}$
supplemented by 2-loop threshold corrections [37], while even these 2-loop threshold
corrections are unavailable for the dijet cross section. We calculate jet cross sections using
FASTNLO [38] (based on NLOJET++ [39, 40]), which includes these 2-loop threshold
corrections. Following the usual way of estimating theoretical uncertainties due to unknown
higher-order corrections, we take different scale choices $\mu_R = \mu_F = \mu = \{p_T/2, p_T, 2p_T\}$ as
some indication of the theoretical uncertainty. Smaller scale choices raise the partonic cross
section, so favour softer high-x gluon distributions [13], and the central $\mu = p_T$ was chosen
for the final MSTW08 fit [13]. We comment on the scale dependence in section 3.4, we
present distributions of pulls and systematic shifts in section 3.5, we briefly discuss other
collider data on jet cross sections in section 3.6, then finally we summarise our findings in
section 3.7.

3.1 Definition of goodness-of-fit, $\chi^2$

It is important to account for correlated systematic uncertainties of the experimental data
points. The full correlated error information is accounted for by using a goodness-of-fit
($\chi^2$) definition given by [41, 42]

$$\chi^2 = \sum_{i=1}^{N_{\rm pts.}} \left( \frac{\hat{D}_i - T_i}{\sigma_i^{\rm uncorr.}} \right)^2 + \sum_{k=1}^{N_{\rm corr.}} r_k^2, \tag{3.1}$$

where $T_i$ are the theory predictions and

$$\hat{D}_i = D_i - \sum_{k=1}^{N_{\rm corr.}} r_k\, \sigma_{k,i}^{\rm corr.} \tag{3.2}$$

are the data points allowed to shift by the systematic errors in order to give the best fit.
Here, $i = 1, \ldots, N_{\rm pts.}$ labels the individual data points and $k = 1, \ldots, N_{\rm corr.}$ labels the
individual correlated systematic errors. The data points $D_i$ have uncorrelated (statistical
and systematic) errors $\sigma_i^{\rm uncorr.}$ and correlated systematic errors $\sigma_{k,i}^{\rm corr.}$. Minimising the
$\chi^2$ in eq. (3.1) with respect to the systematic shifts $r_k$ gives the analytic result that [41, 42]

$$r_k = \sum_{k'=1}^{N_{\rm corr.}} (A^{-1})_{kk'} B_{k'}, \tag{3.3}$$

where

$$A_{kk'} = \delta_{kk'} + \sum_{i=1}^{N_{\rm pts.}} \frac{\sigma_{k,i}^{\rm corr.}\, \sigma_{k',i}^{\rm corr.}}{(\sigma_i^{\rm uncorr.})^2}, \qquad B_k = \sum_{i=1}^{N_{\rm pts.}} \frac{\sigma_{k,i}^{\rm corr.}\, (D_i - T_i)}{(\sigma_i^{\rm uncorr.})^2}, \tag{3.4}$$

and $\delta_{kk'}$ is the Kronecker delta. Therefore, the optimal shifts of the data points by the
systematic errors, eq. (3.2), are solved for analytically. Here we use the same notation$^4$ as
in the MSTW08 paper [13]. We treat the luminosity uncertainty as any other correlated
systematic. However, we find that the relevant systematic shift $r_{\rm lumi.} \sim 3$-$5$ for some PDF
sets with soft high-x gluon distributions (e.g. ABKM09 and HERAPDF1.0), which is clearly
completely unreasonable, as it means that the data points are normalised downwards by
3-5 times the nominal luminosity uncertainty (around 6% for both CDF and D0). The
penalty term $r_{\rm lumi.}^2$ will contribute only 9-25 units to the total $\chi^2$ given by eq. (3.1), which
can therefore still lead to reasonably low overall $\chi^2$ values (see appendix A for details).
It is the usual situation at collider experiments that the luminosity determination is
common to all cross sections measured from a given data set (see, for example, refs. [44, 45]),
so the requirement of a single common luminosity is mandatory when fitting multiple
measurements taken during a single running period. In figure 9 we compare NNLO predictions
for the W and Z total cross sections at the Tevatron Run II, calculated in the zero-width
approximation with settings described in ref. [6]; see also similar comparisons in ref. [10].
The format of the plots in figure 9 is the same as for the $gg \to H$ cross sections in
section 2.3, i.e. we show the cross-section predictions plotted against $\alpha_S(M_Z^2)$. We compare
to CDF Run II data on W [46] and Z [47] total cross sections, and to D0 Run II data
on the Z total cross section [48]. The thicker horizontal lines in figure 9 indicate the
central value of each experimental measurement, the thinner horizontal lines indicate the


$^4$We note a typo, already pointed out in ref. [43], in the formula for $A_{kk'}$ in eq. (40) of ref. [13], where
$\sigma_i^{\rm uncorr.}$ should appear squared. This typo is corrected in eq. (3.4) above.
Figure 9. NNLO predictions for (a) W and (b) Z total cross sections at the Tevatron Run II,
plotted as a function of $\alpha_S(M_Z^2)$, compared to CDF W [46], CDF Z [47] and D0 Z [48] data.



statistical and systematic (excluding luminosity) uncertainties added in quadrature, while
the shaded regions indicate the total uncertainty obtained by also adding the luminosity
uncertainty in quadrature. The plotted CDF Z measurement with 2.1 fb$^{-1}$ [47] supersedes
the earlier Z measurement with 72 pb$^{-1}$ [46], but both measurements are dominated by
the (common) luminosity uncertainty. The D0 experiment has not published any
dedicated W and Z total cross-section measurements from Run II at the Tevatron. The D0
Z total cross section shown in figure 9(b) was obtained as part of the Z+jet
measurement [48]. The CDF measurement [47] is defined as the $Z/\gamma^* \to ee$ cross section in an
invariant mass range $M_{ee} \in [66, 116]$ GeV, while the D0 measurement [48] is defined as
the $Z/\gamma^* \to \mu\mu$ cross section in an invariant mass range $M_{\mu\mu} \in [65, 115]$ GeV. We have
therefore multiplied the CDF and D0 data by factors of 1.006 and 1.004, respectively,
derived using the vrap code [49] at NNLO with MSTW08 PDFs, to correct to the Z-only
cross section with $M_{\ell\ell} = M_Z$. We note from figure 9 that the MSTW08, ABKM09 and
JR09 NNLO predictions for the W and Z total cross sections at the Tevatron are in good
agreement with the CDF data [46, 47], and lie around 1-$\sigma$ above the D0 data [48]. In
the MSTW08 fit [13], the luminosity shift for the CDF jet data was correctly tied to be
the same as for the more-constraining CDF Z rapidity distribution, $d\sigma_Z/dy$ [47], which
therefore effectively acted as a luminosity monitor. The optimal CDF normalisation in the
MSTW08 NNLO fit [13] was found to be very close to the nominal value, therefore it is
not surprising that the CDF Z total cross section is well described in figure 9(b). The D0
experiment instead measured the Z rapidity shape distribution, $(1/\sigma_Z)\, d\sigma_Z/dy$ [50], also
included in the MSTW08 fit, which is one reason why the D0 jet data were found to be
less constraining than the CDF jet data; see ref. [32]. The optimal D0 normalisation in the
MSTW08 NNLO fit [13], determined only from jet data, was around 1-$\sigma$ above the nominal
value, consistent with the D0 Z total cross section shown in figure 9(b). If the Tevatron
jet data were normalised downwards by 20-30% (i.e. 3-5 times the luminosity uncertainty),
the Tevatron W and Z total cross sections would need to be normalised downwards by the
same amount, resulting in complete disagreement with all theory predictions shown in
figure 9. This example illustrates the utility of simultaneously fitting W and Z cross sections
together with jet cross sections at the Tevatron (and LHC). The luminosity shifts, common
to both data sets, are effectively determined by the more precise W and Z cross sections.
The luminosity uncertainty is then effectively removed from the jet cross sections, thereby
allowing the jet data to provide a tighter constraint on the gluon distribution (and $\alpha_S$).

To avoid these completely unrealistic luminosity shifts, $r_{\rm lumi.} \sim 3$-5, without going into
the complication of simultaneously including W and Z cross sections in the $\chi^2$ computation,
we will calculate the $\chi^2$ values for the Tevatron jet data using eq. (3.1), but with the simple
restriction that the relevant systematic shift $|r_{\rm lumi.}| \le 1$. More practically, this means that
if $|r_{\rm lumi.}| > 1$ for any particular PDF set, we fix $r_{\rm lumi.}$ at ±1 and reevaluate eq. (3.1)
with the luminosity removed from the list of correlated systematics. However, we note
from figure 9 that the ABKM09 predictions are slightly above the central value of the
CDF W and Z data, and the HERAPDF1.0 predictions are higher by around 1-$\sigma$, while
both ABKM09 and HERAPDF1.0 lie above the 1-$\sigma$ limit of the D0 Z data. Allowing
luminosity shifts downwards by even 1-$\sigma$ is therefore distinctly generous, particularly for
HERAPDF1.0, and upwards luminosity shifts would bring the ABKM09 and HERAPDF1.0
predictions into better agreement with the CDF W and Z data, and especially the D0 Z
data. Therefore, it should be understood that the $\chi^2$ values quoted in the tables we will
present in sections 3.2 and 3.3 are rather optimistic for ABKM09 and HERAPDF1.0, and
more realistic constraints on the luminosity shifts would result in even worse $\chi^2$ values.
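
The restricted-luminosity procedure described above can be sketched numerically. The following is a minimal illustration, not the code used for the results in this paper. It assumes that eq. (3.1) has the form $\chi^2 = \sum_i [(\hat{D}_i - T_i)/\sigma_i^{\rm uncorr.}]^2 + \sum_k r_k^2$ with shifted data $\hat{D}_i = D_i - \sum_k r_k \sigma_{k,i}^{\rm corr.}$, so that the optimal shifts follow from a linear system. The function names and the toy numbers are hypothetical, and we also assume that a luminosity shift fixed at ±1 still contributes $r_{\rm lumi.}^2 = 1$ to the penalty term.

```python
# Sketch only: the explicit form of eq. (3.1) and all toy numbers are assumptions.

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def best_shifts(data, theory, uncorr, corr):
    """Shifts r_k minimising chi2; corr[k][i] is source k's error on point i."""
    pts = range(len(data))
    a = [[(1.0 if k == l else 0.0)
          + sum(corr[k][i] * corr[l][i] / uncorr[i] ** 2 for i in pts)
          for l in range(len(corr))] for k in range(len(corr))]
    b = [sum(corr[k][i] * (data[i] - theory[i]) / uncorr[i] ** 2 for i in pts)
         for k in range(len(corr))]
    return solve_linear(a, b)


def chi2(data, theory, uncorr, corr, shifts):
    """First term (shifted data vs theory) plus quadratic penalty sum_k r_k^2."""
    total = sum(r * r for r in shifts)
    for i, d in enumerate(data):
        shifted = d - sum(r * corr[k][i] for k, r in enumerate(shifts))
        total += ((shifted - theory[i]) / uncorr[i]) ** 2
    return total


def chi2_limited_lumi(data, theory, uncorr, corr, lumi=0, max_shift=1.0):
    """Evaluate chi2, fixing the luminosity shift at +-max_shift if it exceeds it."""
    shifts = best_shifts(data, theory, uncorr, corr)
    if abs(shifts[lumi]) <= max_shift:
        return chi2(data, theory, uncorr, corr, shifts), shifts
    r_lumi = max_shift if shifts[lumi] > 0 else -max_shift
    # Apply the fixed luminosity shift to the data, then refit the other sources.
    fixed = [d - r_lumi * corr[lumi][i] for i, d in enumerate(data)]
    others = [c for k, c in enumerate(corr) if k != lumi]
    rest = best_shifts(fixed, theory, uncorr, others)
    # Assumption: the fixed shift still contributes r_lumi^2 to the penalty.
    total = chi2(fixed, theory, uncorr, others, rest) + r_lumi ** 2
    return total, rest[:lumi] + [r_lumi] + rest[lumi:]


# Toy example: theory 20% below data, 6% luminosity error, one other source.
data = [100.0, 80.0, 60.0]
theory = [80.0, 64.0, 48.0]
uncorr = [2.0, 2.0, 2.0]
corr = [[6.0, 4.8, 3.6], [1.0, -1.0, 0.5]]

free = best_shifts(data, theory, uncorr, corr)
chi2_free = chi2(data, theory, uncorr, corr, free)
chi2_lim, limited = chi2_limited_lumi(data, theory, uncorr, corr)
print("unrestricted r_lumi = %.2f, chi2 = %.2f" % (free[0], chi2_free))
print("restricted   r_lumi = %.2f, chi2 = %.2f" % (limited[0], chi2_lim))
```

In this toy case the unrestricted fit absorbs most of the 20% deficit in the theory through a luminosity shift of about 3, whereas the restricted evaluation exposes the deficit as a larger $\chi^2$.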

The form of eq. (3.1) is slightly different from the treatment of normalisation uncer-
tainties adopted in eq. (38) of the MSTW08 paper [13], but is the form used, for example,
in the CT10 analysis [18]. Rescaling only the central value of the data in eq. (3.1), but
not the uncertainties, leads to so-called "d'Agostini bias" [51, 52]. However, since we are
only comparing and not fitting PDFs, we use the simpler form of eq. (3.1) which has the
major advantage that all shifts can be solved for analytically. A more sophisticated ap-
proach to the treatment of normalisation uncertainties may somewhat lessen the preference
of some PDF sets for large downwards luminosity shifts, but should not affect our main
conclusions. The normalisation uncertainties were treated as multiplicative rather than
additive in the MSTW08 fit [13], i.e. the uncertainties were correctly rescaled to reduce
bias. Moreover, large normalisation shifts for any experiment were discouraged through
use of a quartic penalty term rather than the usual quadratic penalty term in eq. (3.1).
These small differences in definition mean that the MSTW08 $\chi^2$ values we quote here
will be slightly different from the values quoted in ref. [13].
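
The origin of the "d'Agostini bias" can be seen in a two-point toy example. The sketch below is illustrative only: the numbers are not taken from this paper, and the function name is hypothetical. Two measurements of the same quantity share a common normalisation uncertainty, and the covariance matrix is built from errors proportional to the measured values rather than to the fitted value. The resulting least-squares average then falls below both measurements.

```python
def average_with_common_norm(x1, x2, rel_uncorr, rel_norm):
    """Least-squares average of two measurements sharing a normalisation error.

    The covariance uses errors proportional to the *measured* values
    (an additive treatment), which is what produces the downward bias.
    """
    s1, s2 = rel_uncorr * x1, rel_uncorr * x2
    v11 = s1 ** 2 + (rel_norm * x1) ** 2
    v22 = s2 ** 2 + (rel_norm * x2) ** 2
    v12 = rel_norm ** 2 * x1 * x2
    # Minimising (x - k)^T V^{-1} (x - k) over the common true value k:
    return (x1 * (v22 - v12) + x2 * (v11 - v12)) / (v11 + v22 - 2.0 * v12)


# Illustrative inputs: 2% uncorrelated errors, 10% common normalisation error.
avg = average_with_common_norm(8.0, 8.5, 0.02, 0.10)
print("biased average = %.2f (inputs were 8.0 and 8.5)" % avg)
```

A multiplicative treatment, in which the normalisation error scales with the theory prediction instead of the data, avoids this effect.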

Even considering the constraint on the CDF and D0 luminosities from the comparison
to the weak boson cross sections (see figure 9), it might be considered that imposing
$|r_{\rm lumi.}| < 1$ is too restrictive if the luminosity uncertainty is assumed to be Gaussian.
However, as another reason for limiting the luminosity shifts to some extent, we note
that it has been claimed (see section 6.7.4 on "Normalizations", pg. 170 in [53]) that, for
many experiments, quoted normalisation uncertainties represent the limits of a box-shaped
distribution rather than the standard deviation of a Gaussian distribution. This was one
motivation for the more severe quartic penalty term for normalisation uncertainties in the
MSTW08 analysis; see discussion in section 5.2.1 of ref. [13]. Nevertheless, if we instead
impose $|r_{\rm lumi.}| < 2$ rather than $|r_{\rm lumi.}| < 1$, then the change in the $\chi^2/N_{\rm pts.}$ values for the
most relevant ABKM09 NNLO PDF set with $\mu = p_T$ is {2.76 → 2.10, 1.94 → 1.81, 1.55 →
1.55, 1.49 → 1.41} for the {CDF $k_T$ [33], CDF Midpoint [34], D0 inclusive [35], D0
dijet [36]} data, respectively, so there is not a significant improvement in the $\chi^2$ values.
However, as discussed above, our main argument does not rely on the precise form of
the uncertainty on the luminosity determination, but that we can use the W and Z cross
sections as a luminosity monitor, where the predictions have small theoretical uncertainties,
effectively providing an accurate luminosity determination independently of the CDF and
D0 values. Combining these arguments, we consider allowing luminosity shifts downwards
by more than 1-$\sigma$ to be excessively generous.

There is a clear trade-off between the systematic shifts and the parameters of the 
gluon distribution. Deficiencies in the theory calculation can be masked to some extent by 
large systematic shifts, therefore it is important to check that the optimal values are not 
unreasonable. This is straightforward when using a definition like eq. (3.1), but is more 
difficult using an equivalent form written in terms of the experimental covariance matrix. 



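Define the experimental covariance matrix as

$$ V_{ii'} = \delta_{ii'}\,(\sigma_i^{\rm uncorr.})^2 + \sum_{k=1}^{N_{\rm corr.}} \sigma_{k,i}^{\rm corr.}\,\sigma_{k,i'}^{\rm corr.}. \qquad (3.5) $$

Then eq. (3.1) is equivalent [41] to the more traditional form written in terms of the
inverse of the experimental covariance matrix:

$$ \chi^2 = \sum_{i=1}^{N_{\rm pts.}} \sum_{i'=1}^{N_{\rm pts.}} (D_i - T_i)\,(V^{-1})_{ii'}\,(D_{i'} - T_{i'}), \qquad (3.6) $$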

as used by the ABKM and NNPDF fitting groups. More precisely, NNPDF use a refine- 
ment to treat normalisation errors as multiplicative [52], while Alekhin (ABKM) treats all 
correlated systematic errors as multiplicative [54, 55]. 

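It can easily be seen from eqs. (3.5) and (3.6) that treating the correlated errors as
uncorrelated ($V_{ii'} \propto \delta_{ii'}$) leads to the familiar form of

$$ \chi^2 = \sum_{i=1}^{N_{\rm pts.}} \left( \frac{D_i - T_i}{\sigma_i^{\rm tot.}} \right)^2, \qquad (3.7) $$

where the total error is simply obtained by adding all errors in quadrature,

$$ (\sigma_i^{\rm tot.})^2 = (\sigma_i^{\rm uncorr.})^2 + \sum_{k=1}^{N_{\rm corr.}} (\sigma_{k,i}^{\rm corr.})^2. \qquad (3.8) $$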
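
The equivalence between the shift form and the covariance-matrix form can be checked numerically in the simplest case of two data points and one correlated source. The sketch below is illustrative only, with hypothetical function names and toy numbers. It assumes that eq. (3.1) is the sum of squared residuals of the shifted data plus a penalty $r^2$, that the covariance matrix is the uncorrelated variance on the diagonal plus the outer product of the correlated errors, and that the quadrature form uses the total error obtained by adding both in quadrature.

```python
def chi2_shift_form(d, t, s, c):
    """Shift form for ONE correlated source c[i], minimised analytically over r."""
    a = 1.0 + sum(ci * ci / si ** 2 for ci, si in zip(c, s))
    b = sum(ci * (di - ti) / si ** 2 for ci, di, ti, si in zip(c, d, t, s))
    r = b / a  # optimal systematic shift
    first = sum(((di - r * ci - ti) / si) ** 2
                for ci, di, ti, si in zip(c, d, t, s))
    return first + r * r, r


def chi2_covariance_form(d, t, s, c):
    """Covariance form for two data points, inverting the 2x2 matrix explicitly."""
    v11 = s[0] ** 2 + c[0] ** 2
    v22 = s[1] ** 2 + c[1] ** 2
    v12 = c[0] * c[1]
    det = v11 * v22 - v12 ** 2
    x0, x1 = d[0] - t[0], d[1] - t[1]
    return (v22 * x0 * x0 - 2.0 * v12 * x0 * x1 + v11 * x1 * x1) / det


def chi2_quadrature(d, t, s, c):
    """All errors added in quadrature, correlations ignored."""
    return sum((di - ti) ** 2 / (si ** 2 + ci ** 2)
               for ci, di, ti, si in zip(c, d, t, s))


# Toy inputs (assumptions): data, theory, uncorrelated and correlated errors.
d, t = [10.0, 12.0], [9.0, 11.5]
s, c = [0.5, 0.6], [0.8, 1.0]

chi2_shift, r_opt = chi2_shift_form(d, t, s, c)
chi2_cov = chi2_covariance_form(d, t, s, c)
chi2_quad = chi2_quadrature(d, t, s, c)
print("shift form      : %.5f (r = %.3f)" % (chi2_shift, r_opt))
print("covariance form : %.5f" % chi2_cov)
print("quadrature form : %.5f" % chi2_quad)
```

The first two numbers agree to rounding error, while the quadrature value differs because it discards the correlation between the two points. The shift form has the practical advantage of returning $r$ explicitly, so that unreasonable shifts can be spotted.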

3.2 Inclusive jet production 

In tables 1, 2 and 3 we give the $\chi^2$ per data point, calculated using eq. (3.1) with the
restriction $|r_{\rm lumi.}| < 1$, for the Tevatron Run II data on inclusive jet production [33-35], for
different PDF sets and different scale choices $\mu_R = \mu_F = \mu = \{p_T/2, p_T, 2p_T\}$, where $p_T$
is the jet transverse momentum. For NNPDF2.1 the jet cross sections are averaged over

NLO PDF (with NLO $\hat{\sigma}$)           $\mu = p_T/2$   $\mu = p_T$     $\mu = 2p_T$
MRST04                                  1.06 (0.59)     0.94 (0.31)     0.84 (0.31)
MSTW08                                  0.75 (0.30)     0.68 (0.28)     0.91 (0.84)
CTEQ6.6                                 1.25 (0.14)     1.66 (0.20)     2.38 (0.84)
CT10                                    1.0? (0.13)     1.20 (0.19)     1.81 (0.84)
NNPDF2.1                                0.74 (0.29)     0.82 (0.25)     1.23 (0.69)
HERAPDF1.0                              2.43 (0.39)     3.26 (0.66)     4.03 (1.67)
HERAPDF1.5                              2.26 (0.40)     3.05 (0.66)     3.80 (1.66)
ABKM09                                  1.62 (0.52)     2.21 (0.85)     3.26 (2.10)
GJR08                                   1.36 (0.23)     0.94 (0.13)     0.79 (0.36)

NNLO PDF (with NLO+2-loop $\hat{\sigma}$)   $\mu = p_T/2$   $\mu = p_T$     $\mu = 2p_T$
MRST06                                  2.96 (1.24)     1.21 (1.18)     1.03 (0.84)
MSTW08                                  1.39 (0.42)     0.69 (0.44)     0.97 (0.48)
HERAPDF1.0, $\alpha_S(M_Z^2) = 0.1145$      2.64 (0.36)     2.15 (0.36)     2.20 (0.46)
HERAPDF1.0, $\alpha_S(M_Z^2) = 0.1176$      2.24 (0.35)     1.17 (0.32)     1.23 (0.31)
ABKM09                                  2.55 (0.82)     2.76 (0.89)     3.41 (1.17)
JR09                                    0.75 (0.37)     1.26 (0.41)     2.21 (0.49)

Table 1. Values of $\chi^2/N_{\rm pts.}$ for the CDF Run II inclusive jet data using the $k_T$ jet algorithm [33]
with $N_{\rm pts.} = 76$ and $N_{\rm corr.} = 17$, for different PDF sets and different scale choices $\mu_R = \mu_F = \mu =
\{p_T/2, p_T, 2p_T\}$. The values are calculated accounting for all 17 sources of correlated systematic
uncertainty, using eq. (3.1), including the 5.8% normalisation uncertainty due to the luminosity
determination. At most a 1-$\sigma$ shift in normalisation is allowed. We highlight in bold those values
lying inside the 90% C.L. region, defined by eq. (3.9), which gives $\chi^2/N_{\rm pts.} < 0.83$. The values
of $\chi^2/N_{\rm pts.}$ computed using eq. (3.7), simply adding all experimental uncertainties in quadrature
(including luminosity), are shown in brackets in the table. If the theory prediction was identically
zero, the $\chi^2/N_{\rm pts.}$ values would be 25.0 (37.5) with (without) accounting for correlations between
systematic uncertainties.

100 replica sets. We give the $\chi^2/N_{\rm pts.}$ values defined by simply adding all uncertainties
in quadrature, eq. (3.7), in brackets in the tables. In this case many PDF sets and scale
choices give a $\chi^2/N_{\rm pts.} \lesssim 1$, so the consistent treatment of correlated uncertainties is
vital for the jet data to discriminate. In the table captions we give the $\chi^2$ values with an
identically zero theory prediction, $T_i = 0$, just to illustrate how the correlated systematic
shifts can partially accommodate a clearly inadequate theory prediction. We highlight in
bold the values lying inside the 90% C.L. region defined as

$$ \chi^2 < \left( \frac{\chi_0^2}{\xi_{50}} \right) \xi_{90}, \qquad (3.9) $$

where $\xi_{50}$ and $\xi_{90}$ are the 50th and 90th percentiles of the $\chi^2$-distribution with $N_{\rm pts.}$ degrees
of freedom. (These quantities are defined in detail in section 6.2 of ref. [13].) Here, $\chi_0^2$ is
defined as the lowest $\chi^2$ value of all theory predictions in each table, i.e. assumed to be close to
the best possible fit, so that the rescaling factor $\chi_0^2/\xi_{50}$ in eq. (3.9) empirically accounts for

NLO PDF (with NLO $\hat{\sigma}$)           $\mu = p_T/2$   $\mu = p_T$     $\mu = 2p_T$
MRST04                                  2.14 (1.42)     2.01 (0.54)     1.57 (0.26)
MSTW08                                  1.52 (0.61)     1.40 (0.27)     1.16 (0.73)
CTEQ6.6                                 1.93 (0.41)     1.98 (0.21)     1.78 (0.78)
CT10                                    1.7? (0.??)     1.69 (0.19)     1.50 (0.76)
NNPDF2.1                                1.69 (0.60)     1.56 (0.25)     1.44 (0.60)
HERAPDF1.0                              2.61 (0.23)     2.73 (0.49)     2.53 (1.58)
HERAPDF1.5                              2.48 (0.24)     2.60 (0.49)     2.44 (1.57)
ABKM09                                  1.56 (0.26)     1.68 (0.65)     1.69 (2.01)
GJR08                                   2.11 (0.71)     1.75 (0.24)     1.52 (0.31)

NNLO PDF (with NLO+2-loop $\hat{\sigma}$)   $\mu = p_T/2$   $\mu = p_T$     $\mu = 2p_T$
MRST06                                  2.83 (2.25)     2.08 (1.56)     2.11 (0.86)
MSTW08                                  1.67 (0.62)     1.39 (0.43)     1.62 (0.37)
HERAPDF1.0, $\alpha_S(M_Z^2) = 0.1145$      2.20 (0.25)     2.06 (0.27)     2.19 (0.40)
HERAPDF1.0, $\alpha_S(M_Z^2) = 0.1176$      2.08 (0.55)     1.76 (0.33)     1.99 (0.23)
ABKM09                                  1.70 (0.50)     1.94 (0.71)     2.26 (1.12)
JR09                                    1.57 (0.41)     2.05 (0.36)     2.82 (0.39)

Table 2. Values of $\chi^2/N_{\rm pts.}$ for the CDF Run II inclusive jet data using the cone-based Midpoint
jet algorithm [34] with $N_{\rm pts.} = 72$ and $N_{\rm corr.} = 25$, for different PDF sets and different scale choices
$\mu_R = \mu_F = \mu = \{p_T/2, p_T, 2p_T\}$. The values are calculated accounting for all 25 sources of
correlated systematic uncertainty, using eq. (3.1), including the 5.8% normalisation uncertainty due
to the luminosity determination. At most a 1-$\sigma$ shift in normalisation is allowed. We highlight in
bold those values lying inside the 90% C.L. region, defined by eq. (3.9), which gives $\chi^2/N_{\rm pts.} < 1.43$.
The values of $\chi^2/N_{\rm pts.}$ computed using eq. (3.7), simply adding all experimental uncertainties in
quadrature (including luminosity), are shown in brackets in the table. If the theory prediction was
identically zero, the $\chi^2/N_{\rm pts.}$ values would be 5.30 (38.8) with (without) accounting for correlations
between systematic uncertainties.

any unusual fluctuations preventing the best possible fit having $\chi_0^2 \approx \xi_{50} \approx N_{\rm pts.}$ [41]. The
90% C.L. region given in this way is used to determine the PDF uncertainties according to
the "dynamical tolerance" prescription introduced in ref. [13], so PDF sets with $\chi^2$ values
far outside this region cannot be considered to give an acceptable description of the data.
We consider NLO PDFs from MRST04 [56], MSTW08 [13], CTEQ6.6 [17], CT10 [18],
NNPDF2.1 [19], HERAPDF1.0 [23], HERAPDF1.5 (preliminary) [57], ABKM09 [15] and
GJR08 [58, 59]. We consider NNLO PDFs from MRST06 [60], MSTW08 [13],
HERAPDF1.0 [23], ABKM09 [15] and JR09 [21, 22]. The MRST04 and MRST06 fits only
included Tevatron Run I data [61, 62], and were superseded by the MSTW08 fits, but we
show the $\chi^2$ values here just to demonstrate that these older fits do not give a good
description of the newer Tevatron Run II data due to their harder high-x gluon distribution. The
CTEQ6.6 fit includes only the Tevatron Run I data [61, 62], while the CT10 fit includes
Run II data [34, 35] in addition to the Run I data [61, 62], contrary to the MSTW08 and
NNPDF2.1 fits which include only Run II data [33, 36]. The GJR08 fit included some Run

NLO PDF (with NLO $\hat{\sigma}$)           $\mu = p_T/2$   $\mu = p_T$     $\mu = 2p_T$
MRST04                                  … (0.30)
MSTW08                                  1.45 (0.89)     …               1.05 (1.22)
CTEQ6.6                                 … (1.15) … (1.35) …
NNPDF2.1                                1.41 (0.87)     1.29 (0.20)     1.22 (0.96)
HERAPDF1.0                              1.73 (0.27)     1.84 (0.74)     1.83 (2.79)
HERAPDF1.5                              1.78 (0.29)     1.87 (0.75)     1.84 (2.81)
ABKM09                                  1.39 (0.35)     1.43 (1.07)     1.63 (3.66)
GJR08                                   1.90 (1.46)     1.34 (0.45)     1.03 (0.51)



NNLO PDF (with NLO+2-loop $\hat{\sigma}$)   $\mu = p_T/2$   $\mu = p_T$     $\mu = 2p_T$
MRST06                                  3.19 (5.00)     1.77 (3.22)     1.25 (1.50)
MSTW08                                  1.95 (0.90)     1.23 (0.44)     1.08 (0.35)
HERAPDF1.0, $\alpha_S(M_Z^2) = 0.1145$      2.11 (0.37)     1.68 (0.35)     1.41 (0.63)
HERAPDF1.0, $\alpha_S(M_Z^2) = 0.1176$      2.28 (0.95)     1.50 (0.40)     1.17 (0.21)
ABKM09                                  1.68 (0.79)     1.55 (1.21)     1.63 (2.04)
JR09                                    1.84 (0.47)     1.61 (0.36)     1.58 (0.50)

Table 3. Values of $\chi^2/N_{\rm pts.}$ for the D0 Run II inclusive jet data using a cone jet algorithm [35]
with $N_{\rm pts.} = 110$ and $N_{\rm corr.} = 23$, for different PDF sets and different scale choices $\mu_R = \mu_F = \mu =
\{p_T/2, p_T, 2p_T\}$. The values are calculated accounting for all 23 sources of correlated systematic
uncertainty, using eq. (3.1), including the 6.1% normalisation uncertainty due to the luminosity
determination. At most a 1-$\sigma$ shift in normalisation is allowed. We highlight in bold those values
lying inside the 90% C.L. region, defined by eq. (3.9), which gives $\chi^2/N_{\rm pts.} < 1.22$. The values
of $\chi^2/N_{\rm pts.}$ computed using eq. (3.7), simply adding all experimental uncertainties in quadrature
(including luminosity), are shown in brackets in the table. If the theory prediction was identically
zero, the $\chi^2/N_{\rm pts.}$ values would be 7.46 (65.7) with (without) accounting for correlations between
systematic uncertainties.

I [62] and Run II [63] data, while the JR09, ABKM09 and HERAPDF fits did not include 
any Tevatron jet data. 
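
The 90% C.L. thresholds quoted in the captions of tables 1, 2 and 3 can be approximately reproduced from eq. (3.9), read as $\chi^2 < (\chi_0^2/\xi_{50})\,\xi_{90}$. The sketch below is illustrative only. It uses the Wilson-Hilferty approximation for the percentiles of the $\chi^2$-distribution, which is a convenience chosen here and not necessarily the method used for the quoted numbers. The inputs for $\chi_0^2/N_{\rm pts.}$ are the lowest values that can be read from each table; the function names are hypothetical.

```python
from statistics import NormalDist


def chi2_quantile(prob, ndf):
    """Wilson-Hilferty approximation to the chi-square quantile (large ndf)."""
    z = NormalDist().inv_cdf(prob)
    h = 2.0 / (9.0 * ndf)
    return ndf * (1.0 - h + z * h ** 0.5) ** 3


def threshold_per_point(chi2_0_per_point, n_pts, cl=0.90):
    """Upper limit on chi2/N_pts: (chi2_0 / xi_50) * xi_cl, divided by N_pts."""
    xi_50 = chi2_quantile(0.50, n_pts)
    xi_cl = chi2_quantile(cl, n_pts)
    return chi2_0_per_point * xi_cl / xi_50


# (lowest chi2/N_pts read from the table, N_pts) for tables 1, 2 and 3.
for label, chi2_0, n_pts in [("table 1", 0.68, 76),
                             ("table 2", 1.16, 72),
                             ("table 3", 1.03, 110)]:
    print("%s: chi2/N_pts < %.2f" % (label, threshold_per_point(chi2_0, n_pts)))
```

With these inputs the sketch gives limits close to the values 0.83, 1.43 and 1.22 stated in the captions.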

The most constraining data set appears to be the CDF Run II inclusive jet data
using the $k_T$ jet algorithm [33] (see table 1) where, other than MSTW08, only NNPDF2.1
gives an acceptable description for $\mu = p_T$, while HERAPDF1.0 and ABKM09 typically
give $\chi^2/N_{\rm pts.} \sim 2$-3, and CTEQ6.6/CT10 give better values but still much worse than
MSTW08 (and NNPDF2.1). The GJR08/JR09 sets and the HERAPDF1.0 NNLO set
with $\alpha_S(M_Z^2) = 0.1176$ give a reasonable description, at a similar level to CT10, and
give predictions for $gg \to H$ cross sections at the Tevatron which are much closer to the
MSTW08 predictions than those from ABKM09 and the HERAPDF1.0 NNLO set with
$\alpha_S(M_Z^2) = 0.1145$. The same trend is apparent, but to a somewhat lesser extent, for the
CDF Run II inclusive jet data using the cone-based Midpoint jet algorithm [34] (see table 2)
and the D0 Run II inclusive jet data using a cone jet algorithm [35] (see table 3).

In figures 10, 11 and 12 we compare the description of the Tevatron inclusive jet data by

Figure 10. Data/theory ratios for the CDF Run II inclusive jet data using the $k_T$ jet algorithm [33]
with $N_{\rm pts.} = 76$ and $N_{\rm corr.} = 17$, for MSTW08 and ABKM09 NNLO PDFs with NLO partonic cross
sections supplemented by 2-loop threshold corrections, with scale choice $\mu_R = \mu_F = p_T$, and (a) all
experimental errors added in quadrature, then (b) accounting for correlated systematic uncertainties
using eq. (3.1) and showing only the uncorrelated experimental errors.

[Figure 11, panel (b): CDF Run II inclusive jet data (Midpoint); data after systematic shifts, uncorrelated errors shown.]

Figure 11. Data/theory ratios for the CDF Run II inclusive jet data using the cone-based Midpoint
jet algorithm [34] with $N_{\rm pts.} = 72$ and $N_{\rm corr.} = 25$, for MSTW08 and ABKM09 NNLO PDFs
with NLO partonic cross sections supplemented by 2-loop threshold corrections, with scale choice
$\mu_R = \mu_F = p_T$, and (a) all experimental errors added in quadrature, then (b) accounting for
correlated systematic uncertainties using eq. (3.1) and showing only the uncorrelated experimental
errors.

[Figure 12, panels (a) and (b): D0 Run II inclusive jet data (cone, R = 0.7); (a) data points before systematic shifts, total errors shown; (b) data after systematic shifts, uncorrelated errors shown.]

Figure 12. Data/theory ratios for the D0 Run II inclusive jet data using a cone jet algorithm [35]
with $N_{\rm pts.} = 110$ and $N_{\rm corr.} = 23$, for MSTW08 and ABKM09 NNLO PDFs with NLO partonic
cross sections supplemented by 2-loop threshold corrections, with scale choice $\mu_R = \mu_F = p_T$,
and (a) all experimental errors added in quadrature, then (b) accounting for correlated systematic
uncertainties using eq. (3.1) and showing only the uncorrelated experimental errors.

[Figure 13, panel (b): D0 Run II dijet data (cone, R = 0.7); data after systematic shifts, uncorrelated errors shown.]

Figure 13. Data/theory ratios for the D0 Run II dijet data using a cone jet algorithm [36]
with $N_{\rm pts.} = 71$ and $N_{\rm corr.} = 70$, for MSTW08 and ABKM09 NNLO PDFs and NLO partonic cross
sections, with scale choice $\mu_R = \mu_F = p_T$, where $p_T = (p_{T1} + p_{T2})/2$, with (a) all experimental errors
added in quadrature, then (b) accounting for correlated systematic uncertainties using eq. (3.1) and
showing only the uncorrelated experimental errors.

the MSTW08 and ABKM09 NNLO PDFs (recall that the latter give the lowest predictions
for Tevatron Higgs cross sections) by showing the ratio of data to theory defined in two
different ways: (a) first we use the original data points $D_i/T_i$ with uncertainties given
by adding all errors in quadrature (including luminosity), $\sigma_i^{\rm tot.}/T_i$, with the appropriate
$\chi^2$ value in the plot legends obtained using eq. (3.7), then (b) we use the shifted data
points $\hat{D}_i/T_i$ with uncertainties given by $\sigma_i^{\rm uncorr.}/T_i$, with the $\chi^2$ calculated according to
eq. (3.1) and showing the two terms separately in the plot legends. The $p_T$ values for
ABKM09 are slightly offset for clarity in the plots. The size of the second penalty term
in eq. (3.1) is some measure of how much the data points are shifted compared to their
systematic errors. For example, if the penalty term $\sum_{k=1}^{N_{\rm corr.}} r_k^2 > N_{\rm corr.}$, then the data
points are shifted by, on average, more than 1-$\sigma$ for each systematic source $k$. In general,
a poor description of data before the systematic shifts leads to a large penalty term and
a poor description also after the systematic shifts, although this general statement is not
universally true. We note that the shape of the data/theory ratio, both before and after
the systematic shifts, looks remarkably similar as a function of both transverse momentum
$p_T$ and rapidity $y$ in figures 10 and 11. This demonstrates very clearly that the two CDF
inclusive jet measurements [33, 34] each contain the same data, but simply analysed in a
different way, and the change in analysis method is accounted for extremely well by the
change in the theory. Hence, it is not at all surprising that the two data sets can be well
described by the same PDF set. Indeed, it was explicitly demonstrated in ref. [34] that
the ratios of the cross sections measured with the two jet algorithms were in reasonable
agreement with theoretical expectations.

3.3 Dijet production 

In table 4 and figure 13 we show similar results for the D0 Run II dijet data [36], measured
as a function of the dijet invariant mass, $M_{JJ}$, and the largest absolute rapidity, $|y|_{\rm max}$, of
the two jets with the largest transverse momentum. Again, the $M_{JJ}$ values for ABKM09 are
slightly offset for clarity in figure 13. The fastNLO grids are provided with a scale choice
proportional to the mean transverse momentum of these two jets, $p_T = (p_{T1} + p_{T2})/2$,
and we show results with $\mu_R = \mu_F = \mu = \{p_T/2, p_T, 2p_T\}$ in table 4. Taking $\mu = p_T/4$
leads to negative cross sections at large $M_{JJ}$ and large $|y|_{\rm max}$. We multiply the fastNLO
predictions by a factor 4 to account for a mismatch in the bin width factors of the provided
grids. There are no 2-loop threshold corrections available, so we are forced to use only
the pure NLO partonic cross sections with the NNLO PDFs. It can be seen that the
trend in the $\chi^2$ values for the dijet data shown in table 4 appears to be rather different
from the inclusive jet data shown in tables 1, 2 and 3. In particular, in contrast to the
case for inclusive jets, the ABKM09 set gives the best description for $\mu = p_T$, whereas
MSTW08 and NNPDF2.1 have $\chi^2/N_{\rm pts.} \sim 2$ and CTEQ6.6/CT10 has $\chi^2/N_{\rm pts.} \sim 4$-5.
For $\mu = 2p_T$ there is a significant improvement in $\chi^2$ for MSTW08 and NNPDF2.1, and
MSTW08 NNLO for $\mu = 2p_T$ gives the best description out of all PDF sets and scale
choices, while the CTEQ6.6/CT10 sets still have $\chi^2/N_{\rm pts.} \sim 3$ even for the larger scale
choice. However, it is interesting to note that while figures 10(a) and 11(a) show a very
similar trend for the data/theory ratios, figures 12(a) and 13(a) show quite a different trend,

… 1.34 …
GJR08   … (3.92) … (0.66)



NNLO PDF (with NLO $\hat{\sigma}$)          $\mu = p_T/2$   $\mu = p_T$     $\mu = 2p_T$
MRST06                                  8.06 (5.07)     6.55 (3.21)     4.07 (0.96)
MSTW08                                  2.38 (0.63)     1.80 (0.33)     1.31 (1.24)
HERAPDF1.0, $\alpha_S(M_Z^2) = 0.1145$      2.61 (0.48)     2.55 (0.89)     2.40 (2.40)
HERAPDF1.0, $\alpha_S(M_Z^2) = 0.1176$      2.72 (0.83)     2.31 (0.50)     1.96 (1.08)
ABKM09                                  1.36 (0.98)     1.49 (1.93)     1.57 (4.53)
JR09                                    3.29 (0.42)     2.55 (0.24)     1.88 (1.26)

Table 4. Values of $\chi^2/N_{\rm pts.}$ for the D0 dijet data using a cone jet algorithm [36] with $N_{\rm pts.} = 71$ and
$N_{\rm corr.} = 70$, for different NLO PDF sets and different scale choices $\mu_R = \mu_F = \mu = \{p_T/2, p_T, 2p_T\}$,
where $p_T = (p_{T1} + p_{T2})/2$. Only NLO partonic cross sections are used with the NNLO PDFs, since
the 2-loop threshold corrections are only available for the inclusive jet cross section. The values
are calculated accounting for all 70 sources of correlated systematic uncertainty, using eq. (3.1),
including the 6.1% normalisation uncertainty due to the luminosity determination. At most a 1-$\sigma$
shift in normalisation is allowed. The values of $\chi^2/N_{\rm pts.}$ computed using eq. (3.7), simply adding
all experimental uncertainties in quadrature (including luminosity), are shown in brackets in the
table. If the theory prediction was identically zero, the $\chi^2/N_{\rm pts.}$ values would be 5.86 (60.5) with
(without) accounting for correlations between systematic uncertainties.

implying that the change in theory in using the NLO dijet cross section at the same scale 
as the inclusive jet cross section does not account for the difference in the data produced 
by the two methods [35, 36] of binning and analysis. 

At LO we have $M_{JJ} = 2 p_T \cosh y^*$ where $y^* = |y_1 - y_2|/2$, with $y_{1,2}$ the rapidities
of the two jets. It is clear that $p_T$ is a better measure of the "hardness" of the process
than $M_{JJ}$ and therefore $\mu = p_T$ is the most common scale choice for dijet production.
(Consider, for example, the extreme case of elastic pp scattering where each final-state
proton is considered to be a "jet", then $M_{JJ} \sim \sqrt{s}$, but $p_T \sim 0$.) More generally, typical
scale choices in fixed-order perturbative QCD calculations are usually, for example, the
mass or transverse momentum of a produced particle, or a scalar sum of such scales added
either linearly or in quadrature. However, it is clear that choices of scale involving both $p_T$
and $y^*$ are perfectly feasible for dijets, whereas some multiple of $p_T$ seems more obviously
the scale choice for inclusive jets. There is no reason that the choice which best mimics
the full calculation at fixed order for inclusive jets need be the same as for dijets binned in
$M_{JJ}$, i.e. the structure of higher order corrections is not automatically the same. Indeed,
a hybrid scale choice was proposed in ref. [64] to interpolate between a scale choice based
on $p_T$ and one based on $M_{JJ}$, namely $\mu = A M_{JJ}/(2\cosh(B y^*))$, with the two adjustable
parameters chosen to be $A = 0.5$ and $B = 0.7$ so that the difference between the $O(\alpha_S^3)$
calculation and the Born calculation was small over the angular region of interest [64]. It
would be interesting to investigate whether such a scale choice could resolve the somewhat
different conclusions reached from the Tevatron Run II inclusive and dijet data. There is no
requirement that the scale choice for dijets be the same as for inclusive jets. Taking $\mu = p_T$
for inclusive jets and $\mu = 2p_T = p_{T1} + p_{T2}$ for dijets, then the MSTW08 (and NNPDF2.1)
PDFs would give a good description of all four Tevatron Run II jet data sets [33-36].
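
The behaviour of the hybrid scale can be made concrete with a few numbers. The sketch below is illustrative only, with hypothetical function names and an arbitrary jet $p_T$. It uses the LO relation $M_{JJ} = 2 p_T \cosh y^*$ and the hybrid form $\mu = A M_{JJ}/(2\cosh(B y^*))$ as read from the text above. Setting $B = 1$ gives $\mu = A\,p_T$ at all $y^*$, while $B = 0$ gives $\mu = A\,M_{JJ}/2$, so intermediate values of $B$ interpolate between the two.

```python
import math


def dijet_mass(pt, y_star):
    """LO dijet invariant mass for two balanced massless jets."""
    return 2.0 * pt * math.cosh(y_star)


def hybrid_scale(m_jj, y_star, a=0.5, b=0.7):
    """Hybrid scale mu = A * M_JJ / (2 cosh(B y*)); defaults A = 0.5, B = 0.7."""
    return a * m_jj / (2.0 * math.cosh(b * y_star))


pt = 100.0  # GeV, arbitrary illustrative value
for y_star in (0.0, 1.0, 2.0):
    m_jj = dijet_mass(pt, y_star)
    mu = hybrid_scale(m_jj, y_star)
    print("y* = %.1f: M_JJ = %6.1f GeV, mu/pT = %.3f" % (y_star, m_jj, mu / pt))
```

With these defaults $\mu/p_T$ starts at 0.5 for $y^* = 0$ and grows with $y^*$, which is the kind of rapidity-dependent scale discussed above for dijets.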

Another difference, possibly correlated to the issue of scale choice, is that the dijet data 
may probe higher x values than the inclusive jet data. If there are two jets labelled "1" and
"2", and jet "1" has high $p_T$ in the forward region, then the phase space for the jet "2" is
integrated over in the inclusive jet cross section, but will typically lie in the central region, 
creating an imbalance in the x values of the two initial partons. On the other hand, for 
the dijet cross section at high Mjj values, if jet "1" lies in the forward region, then jet "2" 
will typically lie at the same absolute rapidity in the opposite direction, giving similarly 
large x values of the two initial partons. Since high-x PDFs evolve very quickly, probing 
two high-x PDFs increases sensitivity to (factorisation) scale choices. This sensitivity will 
be most extreme when both PDFs are evolving quickly in the same direction (for example, 
both getting smaller with increasing scale), rather than one PDF getting smaller and one 
PDF getting larger as would be the case with one high-x parton and one low-x parton. 
This effect automatically means that the higher-order corrections must be slightly different 
in the two cases of inclusive jet and dijet production. 
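
The difference in the x values probed can be illustrated with LO kinematics. The sketch below is illustrative only, with a hypothetical function name and arbitrary jet kinematics. It assumes two massless jets balanced in $p_T$, for which $x_{1,2} = (p_T/\sqrt{s})\,(e^{\pm y_1} + e^{\pm y_2})$, and takes $\sqrt{s} = 1960$ GeV for Tevatron Run II.

```python
import math


def parton_x(pt, y1, y2, sqrt_s=1960.0):
    """LO momentum fractions for two massless jets balanced in pT."""
    x1 = pt / sqrt_s * (math.exp(y1) + math.exp(y2))
    x2 = pt / sqrt_s * (math.exp(-y1) + math.exp(-y2))
    return x1, x2


pt = 100.0  # GeV, arbitrary illustrative value

# Inclusive-jet-like configuration: jet 1 forward, jet 2 central.
x1_incl, x2_incl = parton_x(pt, 2.0, 0.0)
# High-mass dijet configuration: jets at opposite rapidities.
x1_dijet, x2_dijet = parton_x(pt, 2.0, -2.0)

print("forward + central : x1 = %.3f, x2 = %.3f" % (x1_incl, x2_incl))
print("forward + backward: x1 = %.3f, x2 = %.3f" % (x1_dijet, x2_dijet))
```

In the first configuration one parton is at high x and the other at low x, whereas in the second both partons carry similarly large x.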

3.4 Scale dependence of jet cross sections 

In figure 14 we compare the K-factors for the D0 inclusive and dijet data, defined as the
ratio of the NLO (both with/without the 2-loop threshold corrections) jet cross sections
to the LO jet cross sections, computed with the same MSTW08 NNLO PDFs (and $\alpha_S$) in
the numerator and denominator of the ratio. Using another PDF choice, such as ABKM09
NNLO, makes little difference to the K-factors. The choice $\mu = p_T/2$ has historically
been favoured in MRST/CTEQ fits because the K-factor is close to 1 at central rapidity.
However, going to forward rapidities with the choice $\mu = p_T/2$, the K-factor decreases
substantially with increasing $p_T$. The K-factor with the choice $\mu = p_T$ is more uniform
(with moderate size) across all rapidity bins and $p_T$ values, hence $\mu = p_T$ was chosen for
the MSTW08 analysis [13]. It is striking, however, that although the NLO corrections are
~ 60% for $\mu = 2p_T$, and a further 20% or more with the 2-loop threshold corrections,
the shape of the K-factor is rather more stable across all rapidity bins and $p_T$ with this
choice. In figure 15 we show the ratio of the NLO (both with/without the 2-loop threshold
corrections) jet cross sections with different scale choices to the NLO jet cross section
with $\mu_R = \mu_F = p_T$, again computed with the same MSTW08 NNLO PDFs (and $\alpha_S$) in
the numerator and denominator of the ratio. It can be seen that the use of the 2-loop
threshold corrections for the inclusive jet cross sections stabilises the scale dependence

Figure 14. K-factors using MSTW08 NNLO PDFs for (a) inclusive jet and (b) dijet production.

Figure 15. Ratio of jet cross sections to those with NLO $\hat{\sigma}$ and scale choice $\mu_R = \mu_F = p_T$ using
MSTW08 NNLO PDFs for (a) inclusive jet and (b) dijet production.



(except at the very highest rapidity and $p_T$ values where the low scale choice still leads to
a large variation). To some extent, different scale choices will be compensated by different
systematic shifts, particularly for the luminosity (see appendix A). The predictions for
$\mu = p_T$ are generally in the middle of the other two choices, but this breaks down at high
rapidity and $p_T$ values. Indeed, for dijets $\mu = p_T$ ceases to be the central prediction at
nearly all $p_T$ in the two highest rapidity bins, and is progressively less so in the middle
rapidity bins than for the case of inclusive jets. This supports the idea that the optimal

Figure 16. Distributions of the pulls, $(\hat{D}_i - T_i)/\sigma_i^{\rm uncorr.}$, for each of the four Tevatron data sets
on jet production, with theory predictions calculated using either MSTW08 or ABKM09 NNLO
PDFs, compared to the expectation of a Gaussian distribution with unit width.



choice for dijets might be $p_T$ multiplied by a function $f(y^*)$, growing with increasing $y^*$, so
that $\mu = p_T \cdot f(y^*)$ would be the central prediction over all $|y|_{\rm max}$ bins. (The $y^*$ variable
is closely related to the $|y|_{\rm max}$ variable used by the D0 dijet data [36].) In the absence of
readily-available theory predictions for such a scale choice, the best description of dijet data
by PDFs obtained from fitting to inclusive jet data seems to be given, as a compromise,
by a scale of $\mu = c\,p_T$ with $c > 1$, with our specific example being $c = 2$.

3.5 Distributions of pulls and systematic shifts 

In figure 16 we show the distributions of pulls, $(\hat{D}_i - T_i)/\sigma_i^{\rm uncorr.}$, for all four Tevatron Run
II data sets on jet production, with theory predictions calculated using either MSTW08
or ABKM09 NNLO PDFs and a scale choice $\mu_R = \mu_F = \mu = p_T$. We show the expected
behaviour of a Gaussian distribution with unit width, and the first term in eq. (3.1)
given simply by the sum of squared pulls over all data points. The histogram error bars are simply
given by the square root of the number of entries. We see that the distribution of pulls
is fairly close to the expected Gaussian behaviour for all four data sets, although the tails
for the inclusive jet data with ABKM09 are somewhat broader than expected, leading to
larger $\chi^2$ contributions than for MSTW08, particularly for the CDF data using the $k_T$

Figure 17. Distributions of the systematic shifts, given by eq. (3.3), for each of the four Tevatron
data sets on jet production, with theory predictions calculated using either MSTW08 or ABKM09
NNLO PDFs, compared to the expectation of a Gaussian distribution with unit width.



jet algorithm [33] shown in figure 16(a). However, it is clear that this source does not
account for the complete differences in $\chi^2$ seen previously. In figure 17 we show the similar
distributions of the systematic shifts, $r_k$, again for all four Tevatron Run II data sets on jet
production. We show the expected behaviour of a Gaussian distribution with unit width
and the penalty term simply given by the sum of the $r_k^2$ values. For the inclusive jet
data, the systematic shifts for MSTW08 show the expected Gaussian behaviour, with small
penalty terms $\sum_{k=1}^{N_{\rm corr.}} r_k^2 < N_{\rm corr.}$. On the other hand, the systematic shifts for ABKM09
deviate substantially from Gaussian behaviour, with much larger penalty terms, in
particular for the CDF inclusive jet data using the $k_T$ algorithm shown in figure 17(a). The
systematic shifts for the dijet data shown in figure 17(d) have a much narrower
distribution than the expected Gaussian behaviour for both MSTW08 and ABKM09, suggesting
that the systematic errors are overestimated, are non-Gaussian, or are not independent (or
a combination of these three explanations). Note that the number of systematic sources
($N_{\rm corr.} = 70$) for the dijet data is much greater than for any of the inclusive jet data sets.
Indeed, this allows the value of $\chi^2$ for the description of data by an identically zero theory
prediction to be lower than for some of the PDF sets; see table 4.

The presentation of the results in figures 16 and 17 enables a separation between
contributions to the $\chi^2$ definition, eq. (3.1), from uncorrelated and correlated errors,
respectively. This allows a more informed assessment of the fit quality compared to the more
traditional $\chi^2$ definition of eq. (3.6) in terms of the experimental covariance matrix, used
by the ABKM and NNPDF fitting groups; see also appendix B.3 of ref. [42] and section 4
of ref. [18].
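
The separation described above can be illustrated with a minimal numerical sketch. It assumes the usual nuisance-parameter form of the $\chi^2$ (an uncorrelated term plus a penalty $r^2$) for a single correlated source; eqs. (3.1), (3.2) and (3.6) are not reproduced in this section, so the formulas, the function names and the toy numbers below are illustrative assumptions rather than the definitions used in the fits. The sketch checks that the profiled $\chi^2$ agrees with the covariance-matrix form (via the Sherman-Morrison identity), while only the former exposes the size of the systematic shift.

```python
def profiled_chi2(data, theory, sigma_uncorr, sigma_corr):
    """Minimise sum_i ((D_i - r*s_i - T_i)/u_i)^2 + r^2 over a single shift r.

    Assumed single-source form (hypothetical helper, not the paper's code).
    Returns (uncorrelated part, penalty r^2, best-fit r).
    """
    rows = list(zip(data, theory, sigma_uncorr, sigma_corr))
    num = sum(s * (d - t) / u**2 for d, t, u, s in rows)
    den = 1.0 + sum((s / u) ** 2 for _, _, u, s in rows)
    r = num / den  # analytic minimum of the quadratic in r
    uncorr = sum(((d - r * s - t) / u) ** 2 for d, t, u, s in rows)
    return uncorr, r * r, r


def covariance_chi2(data, theory, sigma_uncorr, sigma_corr):
    """chi^2 = d^T C^{-1} d with C = diag(u_i^2) + s s^T (Sherman-Morrison)."""
    rows = list(zip(data, theory, sigma_uncorr, sigma_corr))
    a = sum(((d - t) / u) ** 2 for d, t, u, _ in rows)
    b = sum(s * (d - t) / u**2 for d, t, u, s in rows)
    c = 1.0 + sum((s / u) ** 2 for _, _, u, s in rows)
    return a - b * b / c


# Toy inputs (hypothetical): data lie systematically above the theory.
data = [10.0, 8.0, 6.0]
theory = [8.0, 6.5, 5.0]
sigma_uncorr = [0.5, 0.4, 0.3]
sigma_corr = [1.0, 0.8, 0.6]

uncorr, penalty, shift = profiled_chi2(data, theory, sigma_uncorr, sigma_corr)
total_cov = covariance_chi2(data, theory, sigma_uncorr, sigma_corr)
print(f"uncorrelated = {uncorr:.3f}, penalty = {penalty:.3f}, shift r = {shift:.3f}")
print(f"covariance form = {total_cov:.3f}")
```

The two totals coincide, but the covariance form returns a single number, so a large penalty term (a shift of several standard deviations) is not visible unless the shift is extracted separately.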

3.6 Other jet cross sections from collider experiments 

The DØ Collaboration has recently made a measurement [65] of the three-jet differential
cross section as a function of the invariant mass of the three jets with the largest transverse
momentum in an event. An exercise has been carried out similar to the one presented
here, where the $\chi^2$ has been evaluated for different PDF (and $\alpha_S$) choices and scale choices
$\mu_R = \mu_F = \mu = \{p_T/2, p_T, 2p_T\}$, where the mean jet $p_T = (p_{T1} + p_{T2} + p_{T3})/3$. The trend
is that MSTW08 and NNPDF2.1 are favoured, as for the inclusive jet study presented here,
while ABKM09 is worse, and CT10 and HERAPDF1.0 are still poorer. We have followed
a similar approach to that of ref. [65] by evaluating the $\chi^2$ only for the central PDF fit,
without accounting for PDF uncertainties. Since the Tevatron jet data provide by far the
most direct constraint on the high-x gluon distribution, the agreement of the central PDF
fit is more important and relevant than obtaining agreement only within possibly large
PDF uncertainties. However, the potential choice of scales for the three-jet cross section is
even broader than for the dijet cross section.

The LHC data on jet production [66-69] are becoming more precise and show some
sensitivity to the PDF choice. However, these data are still being understood and are not
presented with separated correlated systematic uncertainties which would allow a quantitative
$\chi^2$ comparison. Moreover, the general sensitivity is to lower $x_T = 2p_T/\sqrt{s}$, and so
less relevant for Higgs production at the Tevatron. Isolated photon production at the LHC
may also provide a direct constraint on the gluon distribution [70]. The HERA jet data
are less sensitive to the gluon distribution at high x values, being more of a constraint for
$x \sim 0.001$-$0.1$, and there is no NNLO calculation, or any approximation such as the 2-loop
threshold corrections available for the Tevatron inclusive jet data.

3.7 Summary 

Comparison with Tevatron jet data is subtle because of the large correlated systematic
uncertainties. The systematic shifts, eq. (3.2), can compensate for inadequacies in the theory
calculation. The traditional $\chi^2$ definition in terms of the experimental covariance matrix,
eq. (3.6), can hide such systematic shifts. In particular, we find that the Tevatron jet data
need to be normalised downwards by typically between 3-$\sigma$ and 5-$\sigma$ (see appendix A) to
achieve the best agreement with some PDF sets, particularly the ABKM09 predictions.
Even if the luminosity shift is artificially constrained, the other systematic shifts move by
large amounts for the inclusive jet data, incompatible with the Gaussian expectation. No
such problems are observed for the MSTW08 predictions. It can also be seen from the
plots in ref. [12] that the unshifted Tevatron jet data lie significantly above the theory
predictions even after including these data in variants of the ABKM09 fit. Constraining
the Tevatron luminosity shifts, for example, so that the predicted W and Z cross sections
agreed with Tevatron data, would increase the constraining power of the Tevatron jet data
and thereby very likely give a larger $\alpha_S$ and high-x gluon distribution than the current
studies of Alekhin, Blümlein and Moch (ABM) [12]. Even with the existing treatment, the
NNLO Tevatron $gg \to H$ cross section for $M_H = 165$ GeV goes up by {15, 12, 17, 11}%
when including the {CDF $k_T$ [33], CDF Midpoint [34], DØ inclusive [35], DØ dijet [36]}
data set in variants of the ABKM09 fit [12]. The dijet data have a potentially wider range
of allowed scale choices than the inclusive jet data. We conclude that the data on inclusive
jet production therefore provide the cleanest probe of different PDF sets.

4 Value of strong coupling $\alpha_S$ from DIS

There is a common lore (see, for example, ref. [71]) that DIS-only fits prefer low $\alpha_S(M_Z^2)$
values, but ref. [32] showed that not all DIS data sets prefer low $\alpha_S(M_Z^2)$ values. In
particular, this was found to be true only for BCDMS data, and for E665 and SLAC
ep data, while NMC, SLAC ed and HERA data preferred high $\alpha_S(M_Z^2)$ values within
the context of the global fit [32]. (See also the recent NNPDF study at NLO using an
"unbiased" PDF parameterisation [72].)

It is well known that $\alpha_S$ is highly anticorrelated with the low-x gluon distribution
through scaling violations of HERA data: $\partial F_2/\partial\ln(Q^2) \sim \alpha_S\, g$. Then $\alpha_S$ is correlated
with the high-x gluon distribution through the momentum sum rule; see, for example,
figure 14(b) of ref. [32]. Restrictive gluon parameterisations, without the negative small-x
term allowed by MSTW [13], can therefore bias the extracted $\alpha_S$ value. For example,
the default MSTW08 NNLO fit obtained $\alpha_S(M_Z^2) = 0.1171 \pm 0.0014$, while imposing the
restriction of a positive input gluon at $Q_0^2 = 1$ GeV$^2$ gave a best-fit $\alpha_S(M_Z^2) = 0.1157$, but
with a worse $\chi^2$ by 63 units for the global fit to 2615 data points [32].^

What is $\alpha_S$ from only DIS data in the MSTW08 NNLO fit?^ Recall that the global
fit gave $\alpha_S(M_Z^2) = 0.1171 \pm 0.0014$ [32]. To expand on the studies made in ref. [32],
we performed a new NNLO DIS-only fit, which gave a best-fit $\alpha_S(M_Z^2) = 0.1104$, but
with an input gluon distribution which went negative for x > 0.4 due to lack of any
data constraint. This implies a negative charm structure function, $F_2^{\rm charm}$, and a terrible
description ($\chi^2/N_{\rm pts.} \sim 10$ including correlated systematic errors) of Tevatron jet data
using the obtained PDFs. A DIS-only fit fixing the high-x gluon parameters to prevent
such bad behaviour gave $\alpha_S(M_Z^2) = 0.1172$, i.e. very similar to the global fit. However,
a NNLO fit which imposed the condition of the positive low-x gluon, which stopped the
gluon from going negative at high x values, and which also omitted the Tevatron jet
data, gave $\alpha_S(M_Z^2) = 0.1139$, rather closer to the ABKM09 value. The very low value
of $\alpha_S(M_Z^2) = 0.1104$ found in the DIS-only fit is due to the dominance of BCDMS data.
We can show this explicitly by removing the BCDMS data from the DIS-only fit: the
best-fit $\alpha_S(M_Z^2)$ then moves from 0.1104 to 0.1193. Repeating the global fit with BCDMS
data removed gives $\alpha_S(M_Z^2) = 0.1181$, i.e. a change by less than the quoted experimental
uncertainty of ±0.0014. The conclusion is that the Tevatron jet data are vital to pin down
the high-x gluon, giving a smaller low-x gluon and therefore a larger $\alpha_S$ in the global fit
compared to a DIS-only fit, at the expense of some deterioration in the fit quality of the
BCDMS data.^ The benefits of including the Tevatron jet data to obtain sensible results
in a simultaneous fit of PDFs and $\alpha_S$ therefore greatly outweigh any disadvantage such
as lack of complete NNLO corrections.

^The values for the $\chi^2$ increase of 80 at NLO and 63 at NNLO were erroneously interchanged in ref. [32].
^Studies prompted by question from G. Altarelli, December 2010.

The only input DIS value to the current world average $\alpha_S(M_Z^2)$ [31] is the BBG06
value [74], which is from a non-singlet analysis and therefore in principle free of assumptions
made about the gluon distribution. A value of

$$\alpha_S(M_Z^2) = \left\{0.1148^{+0.0019}_{-0.0019},\ 0.1134^{+0.0019}_{-0.0021},\ 0.1141^{+0.0020}_{-0.0022}\right\} \qquad (4.1)$$

was obtained at {NLO, NNLO, N$^3$LO}, by fitting proton and deuteron structure functions,
$F_2^p$ and $F_2^d$, for x > 0.3 (assuming only valence quarks, neglecting the singlet contribution),
and the less precise $F_2^{\rm NS} = 2(F_2^p - F_2^d)$ for x < 0.3. However, using the MSTW08 NNLO
central fit, contributions other than valence quarks are found to make up about 10% (2%)
of $F_2^p$ at x = 0.3 (x = 0.5). As an exercise we performed the MSTW08 NNLO DIS-only
fit just to $F_2^p$ and $F_2^d$ for x > 0.3 (comprising 282 data points, 160 of these from
BCDMS), which gave $\alpha_S(M_Z^2) = 0.1103$ (0.1130) without (with) the singlet contribution
included. This is even lower than the BBG06 value, presumably due to lack of the y > 0.3
cut on BCDMS data applied in the BBG06 analysis. The low value of $\alpha_S(M_Z^2)$ found by
BBG06 [74] is therefore due both to the dominance of BCDMS data and to what we conclude
is the unjustified neglect of the singlet contribution to $F_2^p$ and $F_2^d$ for x > 0.3. Given that
it was argued above that the Tevatron jet data are needed to pin down the high-x gluon,
we conclude that an extraction of $\alpha_S(M_Z^2)$ only from inclusive DIS data is not meaningful,
and the closest possible to a reliable extraction is the MSTW08 NNLO combined analysis
of DIS, Drell-Yan and jet data [13, 32]:

$$\alpha_S(M_Z^2) = 0.1171 \pm 0.0014\ (68\%\ {\rm C.L.}) \pm 0.0034\ (90\%\ {\rm C.L.}). \qquad (4.2)$$

This value is the only NNLO determination, from a simultaneous fit with PDFs, which is in
agreement with the current world average $\alpha_S(M_Z^2) = 0.1184 \pm 0.0007$ [31]; see figure 5(b).

5 Treatment of NMC data and stability to low data 

A recent claim has been made [11] that the bulk of the MSTW08/ABKM09 difference
in both the extracted $\alpha_S(M_Z^2)$ value and the $gg \to H$ predictions is explained by the
treatment of NMC data [14]. The differential cross section for DIS of charged leptons off
nucleons, $\ell N \to \ell X$, neglecting the nucleon and lepton masses, and assuming single-photon
exchange, is



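$$\frac{{\rm d}^2\sigma}{{\rm d}x\,{\rm d}Q^2} \simeq \frac{4\pi\alpha^2}{x\,Q^4}\left[1 - y + \frac{y^2/2}{1 + R(x,Q^2)}\right] F_2(x,Q^2), \qquad (5.1)$$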



^The low y data points from BCDMS are strongly affected by the energy scale uncertainty of the scattered
muon. It has been advocated to impose a cut of y > 0.3 on the BCDMS data, which caused $\alpha_S(M_Z^2)$ to
increase by about 0.004 in a fit to only BCDMS data and by about 0.002 in a combined fit to H1 and
BCDMS data [73].
Figure 18. (a) $R = \sigma_L/\sigma_T \simeq F_L/(F_2 - F_L)$ versus $Q^2$ (in units of GeV$^2$) at x = 0.025 comparing
the $Q^2$-independent $R_{\rm NMC}$ extraction [14], the $Q^2$-dependent SLAC $R_{1990}$ parameterisation [75],
and the MSTW08 NNLO calculation including 1-$\sigma$ PDF uncertainties [13]. (b) $F_2$ versus $Q^2$ (in
units of GeV$^2$) at x = 0.025 comparing the two NMC extractions [14] using either $R_{\rm NMC}$ or $R_{1990}$.



where $R = \sigma_L/\sigma_T \simeq F_L/(F_2 - F_L)$ is the ratio of the $\gamma^* N$ cross sections for longitudinally
and transversely polarised photons, $Q^2$ is the photon virtuality, x is the Bjorken
variable and $y \simeq Q^2/(xs)$ is the inelasticity (with $\sqrt{s}$ the $\ell N$ centre-of-mass energy). The
ABKM09 [15] analysis fitted the NMC differential cross sections directly, calculating $F_L$
to $\mathcal{O}(\alpha_S^2)$ and including empirical higher-twist corrections. The MSTW08 [13] analysis
instead fitted the NMC $F_2$ values corrected for R, where [14]

$$R(x,Q^2) = \begin{cases} R_{\rm NMC}(x) & \text{if } x < 0.12 \\ R_{1990}(x,Q^2) & \text{if } x > 0.12 \end{cases} \qquad (5.2)$$

Here, $R_{\rm NMC}(x)$ was a ($Q^2$-independent) value extracted from NMC data, while $R_{1990}(x, Q^2)$
was a $Q^2$-dependent empirical parameterisation of SLAC data dating from 1990 [75]. By
replacing the NMC differential cross-section data by NMC $F_2$ data, ABM [11] find that
their best-fit $\alpha_S(M_Z^2)$ moves from 0.1135 to 0.1170 and their $gg \to H$ cross sections at the
Tevatron and LHC move closer to the MSTW08 values. ABM [11] therefore conclude that
the use of NMC $F_2$ data in the MSTW08 fit rather than the differential cross section is the
main reason for the higher $\alpha_S(M_Z^2)$ and Higgs cross sections obtained with MSTW08.
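
To see how the choice of R in eq. (5.2) feeds into the extracted $F_2$ through eq. (5.1), the following minimal sketch evaluates the y-dependent bracket of eq. (5.1) for two values of R. The function names and the two R values (0.1 and 0.3) are hypothetical, chosen only for illustration; they are not the $R_{\rm NMC}$ or $R_{1990}$ values.

```python
def y_factor(y, R):
    """Bracket of eq. (5.1): 1 - y + (y^2/2) / (1 + R)."""
    return 1.0 - y + 0.5 * y * y / (1.0 + R)


def f2_ratio(y, R_new, R_old):
    """Ratio of F2 extracted with R_new to F2 extracted with R_old,
    for the same measured cross section at fixed (x, Q^2, y)."""
    # F2 is proportional to (cross section) / y_factor, so the ratio inverts.
    return y_factor(y, R_old) / y_factor(y, R_new)


# Hypothetical R values, scanned over low, medium and high inelasticity.
for y in (0.1, 0.4, 0.8):
    ratio = f2_ratio(y, R_new=0.3, R_old=0.1)
    print(f"y = {y:.1f}: F2(R=0.3) / F2(R=0.1) = {ratio:.4f}")
```

In this toy scan the extracted $F_2$ is almost independent of R at low y, while at high y the same change in R moves $F_2$ by several percent, which is why only the low-x (high-y) NMC bins are affected.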

We agree that it is more consistent to fit directly to the NMC differential cross-section
data, so here we respond to this rather dramatic assertion made by ABM [11], which would
obviously be very worrying if correct. However, rather than repeat the MSTW08 analysis
by fitting the NMC differential cross sections, we note that the original NMC paper [14]
made an alternative extraction of $F_2$ values using the SLAC $R_{1990}$ parameterisation [75].
In figure 18(a) we compare $R_{\rm NMC}$ with $R_{1990}$ in the most affected bin of x = 0.025, i.e. a
low x value where there are a reasonable number (7) of NMC data points surviving the
cut on $Q^2 > 2$ GeV$^2$ and where the difference between $R_{\rm NMC}$ and $R_{1990}$ is at its largest.
Recall that a low x value means a high y value and from eq. (5.1) the correction term from
R is only important at large y. In figure 18(a) we also show the MSTW08 NNLO prediction,
including PDF uncertainties at 68% C.L., with $F_L$ calculated to $\mathcal{O}(\alpha_S^3)$ and without
any higher-twist corrections. We see that it gives a good description of the SLAC $R_{1990}$
parameterisation, with any differences being very much smaller than those between $R_{\rm NMC}$
and $R_{1990}$. We note that NMC/BCDMS/SLAC $F_L$ data are included in the MSTW08 fit
and are well-described at NNLO but less well at NLO (see figure 5 of ref. [32]), so the
$\mathcal{O}(\alpha_S^3)$ coefficient functions are needed for a good description and the larger MSTW08
$\alpha_S(M_Z^2)$ perhaps explains why there is less room for higher-twist corrections, contrary to
the findings of the ABM analysis. Nevertheless, figure 18(a) demonstrates that fitting the
alternative NMC $F_2$ data extracted using the SLAC $R_{1990}$ parameterisation will give very
similar results to fitting the NMC differential cross sections. In fact, given that $R_{1990}$ in
figure 18(a) generally has a slightly steeper $Q^2$ dependence than the MSTW08 parameterisation,
using this will slightly overestimate the true impact of fitting the NMC differential
cross sections. In figure 18(b) we compare the two different NMC $F_2$ extractions, again
for the most affected bin of x = 0.025, and we see that there is little difference, certainly
nothing that seems likely to change $\alpha_S(M_Z^2)$ by 0.0035 in a fit where it is constrained with
an uncertainty of about 0.0014 by over 2000 other data points.

Table 5. Effect of NMC treatment on $\alpha_S(M_Z^2)$ and Higgs cross sections ($M_H = 165$ GeV). We also
show the effect of raising the cuts imposed on the DIS data compared to the default of removing
data with $Q^2 < 2$ GeV$^2$ and $W^2 < 15$ GeV$^2$. Finally, we show the effect of simply fixing $\alpha_S(M_Z^2)$ to
be close to the ABKM09 value, or performing a fit with a positive-definite input gluon distribution
and no jet data, and we compare directly to ABKM09.

In table 5 we show the effect of repeating the MSTW08 NNLO fit with the NMC $F_2$
data extracted using $R_{1990}$ on $\alpha_S(M_Z^2)$ and the Higgs cross sections (for $M_H = 165$ GeV)
at the Tevatron and LHC, and in figure 19 we show the change in the gluon distribution at
the corresponding scale. We make other fits either cutting the NMC $F_2$ data for x < 0.1,
above which the R correction in eq. (5.1) is very small indeed, or completely removing all
NMC $F_2$ data. In all cases there is very little change in $\alpha_S(M_Z^2)$, the gluon distribution, and
the Higgs cross section. We conclude that the treatment of NMC data cannot explain the
difference between the MSTW08 and ABKM09 results. Similar stability has been found
by the NNPDF group [76], but in a less relevant study at NLO with fixed $\alpha_S$.

The cuts on DIS data are not explicitly given in the ABKM09 paper [15], but the
previous AMP06 paper [77] mentions that DIS data are removed with $Q^2 < 2.5$ GeV$^2$ and
$W^2 < (1.8~{\rm GeV})^2 = 3.24$ GeV$^2$ compared to the MSTW08 fit which removes DIS data
with $Q^2 < 2$ GeV$^2$ and $W^2 < 15$ GeV$^2$. The much weaker cut on the hadronic invariant
mass (squared), $W^2 \simeq Q^2(1/x - 1)$, clearly explains why higher-twist corrections are more
important in the ABKM09 analysis. To investigate the possible effect of neglected higher-twist
corrections on the MSTW08 NNLO fit we raised the cuts to remove DIS data with
$W^2 < 20$ GeV$^2$ and either $Q^2 < 5$ GeV$^2$ or $Q^2 < 10$ GeV$^2$. The results are shown in table 5
and figure 19. The changes in $\alpha_S$, the gluon distribution and the Higgs cross sections are
generally small and within uncertainties, although with the strongest $Q^2$ cut there is no
data constraint below $x = 10^{-4}$ and little just above, so the PDFs differ but have large
uncertainties at low x values.^

Figure 19. Effect of NMC treatment on the gluon distribution at a scale $Q^2 \simeq (165~{\rm GeV})^2$. The
values of $x = M_H/\sqrt{s}$ relevant for central production (assuming $p_T^H = 0$) of a Standard Model
Higgs boson of mass $M_H \simeq 165$ GeV at the Tevatron and LHC are indicated. We also show the
effect of raising the cuts imposed on the DIS data compared to the default of removing data with
$Q^2 < 2$ GeV$^2$ and $W^2 < 15$ GeV$^2$. Finally, we show the effect of simply fixing $\alpha_S(M_Z^2)$ to be close
to the ABKM09 value, or performing a fit with a positive-definite input gluon distribution and no
jet data, and we compare directly to ABKM09.
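
The effect of the different kinematic cuts quoted above can be sketched numerically. This is a minimal illustration using $W^2 \simeq Q^2(1/x - 1)$ (nucleon mass neglected) and the cut values given in the text for MSTW08 and AMP06; the function names and the three (x, $Q^2$) points are hypothetical and are not taken from any data set.

```python
def w2(x, q2):
    """Hadronic invariant mass squared, W^2 ~ Q^2 (1/x - 1), in GeV^2."""
    return q2 * (1.0 / x - 1.0)


def passes_cuts(x, q2, q2_min, w2_min):
    """A point is removed if Q^2 < q2_min or W^2 < w2_min; otherwise it is kept."""
    return q2 >= q2_min and w2(x, q2) >= w2_min


# Cut values quoted in the text (GeV^2).
MSTW08_CUTS = {"q2_min": 2.0, "w2_min": 15.0}
AMP06_CUTS = {"q2_min": 2.5, "w2_min": 3.24}

# Hypothetical (x, Q^2) points: two at high x, one at low x.
points = [(0.6, 10.0), (0.35, 5.0), (0.025, 3.0)]
for x, q2 in points:
    print(
        f"x = {x}, Q^2 = {q2} GeV^2, W^2 = {w2(x, q2):.2f} GeV^2: "
        f"MSTW08 keeps = {passes_cuts(x, q2, **MSTW08_CUTS)}, "
        f"AMP06 keeps = {passes_cuts(x, q2, **AMP06_CUTS)}"
    )
```

High-x points at moderate $Q^2$ survive the weaker $W^2$ cut but not the stronger one, which is the region where higher-twist corrections matter most.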

^We also investigated the effect of increasing the cuts on $Q^2$ and $W^2$ in variants of the MSTW NLO fit.
The changes were slightly bigger, with $\alpha_S(M_Z^2)$ changing from 0.1202 to 0.1192 and 0.1175 with $Q^2$ cuts
of 5 and 10 GeV$^2$, respectively. Similarly, the changes in PDFs and cross-section predictions are generally
slightly greater at NLO than at NNLO, i.e. as expected there is some improved stability at higher orders.

In table 5 and figure 19 we show the results of the MSTW08 NNLO fit with a fixed
$\alpha_S(M_Z^2) = 0.113$ [32] (slightly below the ABKM09 value), and even in this case the gluon
distribution and Higgs cross sections move only part of the way towards the ABKM09
result, as already seen in figure 7. The MSTW08 input gluon parameterisation is [13]

$$xg(x, Q_0^2 = 1~{\rm GeV}^2) = A_g\, x^{\delta_g} (1-x)^{\eta_g} (1 + \epsilon_g \sqrt{x} + \gamma_g x) + A_{g'}\, x^{\delta_{g'}} (1-x)^{\eta_{g'}}, \qquad (5.3)$$

compared to the much more restrictive functional forms of the other NNLO fits, namely:

ABKM09 [15]:      $xg(x, Q_0^2 = 9~{\rm GeV}^2) = A_g\, x^{\delta_g} (1-x)^{\eta_g}\, x^{\gamma_g x}$,      (5.4)
JR09 [21]:        $xg(x, Q_0^2 = 0.55~{\rm GeV}^2) = A_g\, x^{\delta_g} (1-x)^{\eta_g}$,      (5.5)
HERAPDF1.0 [23]:  $xg(x, Q_0^2 = 1.9~{\rm GeV}^2) = A_g\, x^{\delta_g} (1-x)^{\eta_g}$.      (5.6)

The normalisation $A_g$ is determined from the momentum sum rule constraint, leaving
7 free parameters for MSTW08 compared to only 3 for ABKM09 and only 2 for JR09
and HERAPDF1.0 (although the value of $Q_0^2$ is optimised in the case of JR09). In the
absence of any direct data constraint on the high-x gluon distribution, the other fits are
therefore constrained by the form of the input parameterisation, avoiding the pathological
behaviour of the negative high-x gluon distribution seen for the MSTW08 NNLO DIS-only
fit described in section 4. As already mentioned in that section, in an attempt to mimic
the ABKM09 fit we performed a variant of the MSTW08 NNLO fit without jet data and
with the second term of eq. (5.3) set to zero. The $\epsilon_g$ and $\gamma_g$ parameters were fixed in the
fit iteration before the high-x gluon distribution went negative. The results of this fit are
shown in table 5 and figure 19 and it goes some way towards reproducing the high-x gluon
of the ABKM09 fit and the corresponding Tevatron $gg \to H$ prediction, certainly closer
than we come with other modifications. Finally, we then investigated the effect of using
NMC data corrected using $R_{1990}$ rather than $R_{\rm NMC}$ in this fit. Similar to our default fit
all changes were at the percent level, or less, so we do not explicitly show them, although
the gluon does move marginally closer again to that of ABKM09.

Other differences between the two analyses are that ABKM09 used the NMC data for
separate muon beam energies, whereas MSTW08 used the NMC data averaged over beam
energies, which reduces the maximum effect of the change in R for a particular data point,
i.e. at a given x and $Q^2$, a data point at high y, and so very sensitive to R at a low beam
energy, is at lower y for a higher beam energy. In the case of the averaged NMC data,
correlated systematic uncertainties are unavailable, so the MSTW08 fit simply added errors
(other than normalisation) in quadrature similar to the simple $\chi^2$ form of eq. (3.7). As
with the Tevatron jet data, deficiencies in the theory calculation may be hidden, without
much trace, by large systematic shifts implicit in the $\chi^2$ definition, eq. (3.6), similar to that
used in the ABKM09 analysis. We conclude that the greater sensitivity to the treatment of
NMC data found by ABM [11] is due to a variety of reasons, but perhaps most significantly,
the inclusion of higher-twist corrections due to the weaker cuts on DIS data, and, as we
have repeatedly emphasised, the lack of additional constraints provided by the Tevatron
jet data to pin down the high-x gluon distribution.
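
The difference in flexibility between the input gluon form of eq. (5.3) and the two-parameter forms of eqs. (5.5) and (5.6) can be sketched as follows. The parameter values are hypothetical, chosen only to show that a negative second term lets the input gluon turn negative at small x; they are not fitted values from any PDF set, and the function names are illustrative.

```python
import math


def xg_mstw08_form(x, A, delta, eta, eps, gamma, A2, delta2, eta2):
    """Functional form of eq. (5.3): flexible first term plus a second power-law term."""
    first = A * x**delta * (1.0 - x) ** eta * (1.0 + eps * math.sqrt(x) + gamma * x)
    second = A2 * x**delta2 * (1.0 - x) ** eta2
    return first + second


def xg_two_param_form(x, A, delta, eta):
    """Restrictive form of eqs. (5.5) and (5.6): A x^delta (1 - x)^eta."""
    return A * x**delta * (1.0 - x) ** eta


# Hypothetical parameters: a negative A2 with a smaller power delta2
# makes the second term dominate at small x.
flexible = dict(A=3.0, delta=0.3, eta=3.0, eps=-2.0, gamma=10.0,
                A2=-1.0, delta2=-0.2, eta2=10.0)
positive_only = dict(flexible, A2=0.0)  # second term of eq. (5.3) set to zero

for x in (1e-4, 1e-2, 0.1, 0.5):
    print(
        f"x = {x:g}: flexible xg = {xg_mstw08_form(x, **flexible):+.3f}, "
        f"second term off = {xg_mstw08_form(x, **positive_only):+.3f}, "
        f"two-parameter = {xg_two_param_form(x, 3.0, 0.3, 3.0):+.3f}"
    )
```

With the second term switched off the toy gluon stays positive everywhere in this scan, mimicking the restriction imposed in the fit variant described above.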
6 Conclusions



The anomalously large higher-order QCD corrections to Higgs production at the Tevatron
and LHC, via the dominant production channel of gluon-gluon fusion through a top-quark
loop, mandate the use of (at least) NNLO calculations, together with corresponding NNLO
PDFs and $\alpha_S$ values. The Tevatron Higgs cross section, in particular, requires knowledge
of the gluon distribution at large x > 0.1 where constraints from DIS or Drell-Yan data
are weak and the only direct constraint comes from Tevatron inclusive jet production. The
MSTW08 fit [13] is currently the only public NNLO PDF set including the Tevatron jet
data, and is used in the analyses of the Tevatron [4, 5] and LHC [3] experiments, while other
NNLO PDF fitting groups (ABKM09 [15], JR09 [21, 22], HERAPDF1.0 [23]) choose to
omit these data, finding quite different results for the predicted Higgs cross sections. This common
choice to use only the MSTW08 set, and not the other publicly available NNLO PDF sets,
has faced a barrage of recent criticism [7-12], which we have responded to in detail in this
paper. We summarise our main findings below:

• We do not recommend that the (experimental) PDF+$\alpha_S$ uncertainty be supplemented
with an additional theoretical uncertainty on $\alpha_S$ when calculating uncertainties
on predicted cross sections, contrary to the approach taken in refs. [7, 8].

• The claim [9] that the HERAPDF1.0 NNLO set with $\alpha_S(M_Z^2) = 0.1145$ lowers the
Higgs cross section compared to MSTW08 by $\sim 40\%$ for $M_H \simeq 160$ GeV at the
Tevatron is due to a mistake in the calculation, and therefore the conclusions in
the published version of ref. [9] are flawed. On the other hand, the observed 25%
reduction with the central value of ABKM09 is still a serious problem and we give
evidence in this paper that the ABKM09 set is not consistent enough with existing
Tevatron data to be used for the calculation of Higgs cross sections.

• Comparison with Tevatron jet data is subtle because of the large correlated systematic
uncertainties and the need to make choices in luminosity which are consistent with
the predictions for W and Z cross sections. The traditional $\chi^2$ definition in terms of
the experimental covariance matrix, eq. (3.6), can hide large systematic shifts, which
can compensate for inadequacies in the theory calculation. In particular, we find
that the Tevatron jet data need to be normalised downwards by typically between
3-$\sigma$ and 5-$\sigma$ to achieve the best agreement with the ABKM09 (and some HERAPDF)
predictions; see appendix A. Even if the luminosity shift is artificially constrained, the
other systematic shifts move by large amounts for the inclusive jet data, incompatible
with the Gaussian expectation. No such problems are observed for the MSTW08
predictions and good agreement is found with all Run II inclusive jet data, and also
with the dijet data if taking a larger scale choice than for the inclusive jet data.

• We have demonstrated that the MSTW08 fit is stable to the treatment of NMC $F_2$
data, unlike the ABKM09 fit [11], most likely because of the averaging over muon
beam energies, because the Tevatron jet data pin down the high-x gluon distribution,
and also due to the stronger cuts reducing the need for large higher-twist corrections.
PDF set                              $\mu = p_T/2$    $\mu = p_T$      $\mu = 2p_T$

MSTW08                               0.75 (+0.32)     0.68 (-0.88)     ...
CTEQ6.6                              1.03 (-2.47)     1.04 (-3.49)     ...
CT10                                 ...              ...              ...
NNPDF2.1                             0.74 (-0.33)     0.79 (-1.60)     0.80 (-3.12)
HERAPDF1.0                           1.52 (-4.07)     1.57 (-5.21)     1.43 (-6.22)
HERAPDF1.5                           1.48 (-3.85)     1.52 (-5.00)     1.39 (-6.03)
ABKM09                               1.03 (-3.49)     1.01 (-4.53)     1.05 (-5.80)
GJR08                                1.14 (+2.47)     0.93 (+1.25)     0.79 (-0.50)

NNLO PDF (with NLO+2-loop $\hat{\sigma}$)
MRST06                               2.80 (+2.23)     1.20 (+1.34)     1.03 (+0.53)
MSTW08                               1.39 (+0.35)     0.69 (-0.45)     0.97 (-1.30)
HERAPDF1.0, $\alpha_S(M_Z^2) = 0.1145$    2.37 (-2.65)     1.48 (-3.64)     1.29 (-4.12)
HERAPDF1.0, $\alpha_S(M_Z^2) = 0.1176$    2.24 (-0.48)     1.13 (-1.60)     1.09 (-2.23)
ABKM09                               1.53 (-4.27)     1.23 (-5.05)     1.44 (-5.65)
JR09                                 0.75 (+0.13)     1.26 (-0.61)     2.20 (-1.22)

Table 6. Values of $\chi^2/N_{\rm pts.}$ for the CDF Run II inclusive jet data using the $k_T$ jet algorithm [33]
with $N_{\rm pts.} = 76$ and $N_{\rm corr.} = 17$, for different PDF sets and different scale choices $\mu_R = \mu_F = \mu =
\{p_T/2, p_T, 2p_T\}$. The $\chi^2$ values are calculated accounting for all 17 sources of correlated systematic
uncertainty, using eq. (3.1), including the 5.8% normalisation uncertainty due to the luminosity
determination. No restriction is imposed on the shift in normalisation and the optimal value of
"$-r_{\rm lumi.}$" is shown in brackets, where the data points are shifted as $D_i \to D_i(1 - 0.058\, r_{\rm lumi.})$; see
eq. (3.2). Values of $|r_{\rm lumi.}| \in [1, 3]$ are shown in italics and values $|r_{\rm lumi.}| > 3$ are shown in bold. If
the theory prediction was identically zero, then $\chi^2/N_{\rm pts.} = 3.43$ with $r_{\rm lumi.} = 15.1$.

Moreover, the MSTW08 NNLO determination of the strong coupling $\alpha_S$ is compatible
with the world average value, unlike other NNLO determinations shown in figure 5(b).

We conclude that the current Tevatron Higgs exclusion bounds [4, 5] are robust, at least
with respect to the treatment of PDFs and $\alpha_S$ in the calculation of the Higgs cross section.
Similar remarks hold for the Higgs cross sections at the LHC recently calculated in ref. [3].

A Appendix: tables with unrestricted luminosity shifts 

For completeness, in tables 6, 7, 8 and 9 we show $\chi^2/N_{\rm pts.}$ values without the restriction
in the luminosity shifts of $|r_{\rm lumi.}| \le 1$ imposed in the main tables given in section 3. Recall
from eq. (3.2) that a positive value of $r_{\rm lumi.}$ means a downwards shift in the luminosity, so
we choose to give in brackets the values of "$-r_{\rm lumi.}$", i.e. negative numbers correspond to
downwards shifts in the luminosity. In the table captions we give the $\chi^2$ values with an
identically zero theory prediction ($T_i = 0$) just to illustrate an extreme case of how large
downwards luminosity shifts can partially accommodate an inadequate theory prediction.
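
The extreme case of an identically zero theory prediction can be reproduced qualitatively with a toy calculation. The sketch below keeps only the luminosity shift, $D_i \to D_i(1 - 0.058\, r_{\rm lumi.})$, with a penalty $r_{\rm lumi.}^2$, and ignores all other correlated sources; the toy spectrum, the 10% uncorrelated errors and the function name are assumptions made purely for illustration, so the numbers are not expected to match those in the table captions.

```python
def best_lumi_shift(data, theory, sigma, frac=0.058):
    """Minimise sum_i ((D_i (1 - frac*r) - T_i) / sigma_i)^2 + r^2 over r.

    Returns (best-fit r, chi^2 at the minimum). Single-source toy model only.
    """
    rows = list(zip(data, theory, sigma))
    num = sum(frac * d * (d - t) / s**2 for d, t, s in rows)
    den = 1.0 + sum((frac * d / s) ** 2 for d, _, s in rows)
    r = num / den  # analytic minimum of the quadratic in r
    chi2 = sum(((d * (1.0 - frac * r) - t) / s) ** 2 for d, t, s in rows) + r * r
    return r, chi2


n_pts = 76                                       # same number of points as table 6
data = [100.0 / (i + 1) for i in range(n_pts)]   # toy falling spectrum (hypothetical)
sigma = [0.10 * d for d in data]                 # assumed 10% uncorrelated errors

r_zero, chi2_zero = best_lumi_shift(data, [0.0] * n_pts, sigma)
r_good, chi2_good = best_lumi_shift(data, [0.97 * d for d in data], sigma)
chi2_unshifted = sum((d / s) ** 2 for d, s in zip(data, sigma))

print(f"zero theory: r = {r_zero:.2f}, chi2/N = {chi2_zero / n_pts:.2f} "
      f"(unshifted chi2/N = {chi2_unshifted / n_pts:.0f})")
print(f"theory 3% below data: r = {r_good:.2f}, chi2/N = {chi2_good / n_pts:.2f}")
```

In this toy model a zero theory prediction is absorbed by a luminosity shift of many standard deviations, reducing $\chi^2/N_{\rm pts.}$ by more than an order of magnitude relative to the unshifted value, whereas a theory prediction close to the data requires a shift of less than one standard deviation.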
PDF set                              $\mu = p_T/2$    $\mu = p_T$      $\mu = 2p_T$

NNPDF2.1                             1.69 (+0.30)     1.56 (-1.01)     1.40 (-2.20)
HERAPDF1.0                           2.49 (-2.84)     2.45 (-3.86)     2.11 (-4.54)
HERAPDF1.5                           2.39 (...)       2.36 (-3.72)     2.05 (-4.42)
ABKM09                               1.52 (-2.05)     1.53 (-3.10)     1.38 (-4.04)
GJR08                                2.02 (...)       1.75 (+1.18)     1.52 (-0.26)

NNLO PDF (with NLO+2-loop $\hat{\sigma}$)
MRST06                               2.72 (+2.83)     2.07 (+1.1...)   2.11 (+0.12)
MSTW08                               1.66 (+1.54)     1.39 (+0.06)     1.62 (-1.00)
HERAPDF1.0, $\alpha_S(M_Z^2) = 0.1145$    2.20 (-1.15)     1.99 (-2.45)     2.04 (-3.06)
HERAPDF1.0, $\alpha_S(M_Z^2) = 0.1176$    2.08 (+0.63)     1.76 (-0.97)     1.96 (-1.78)
ABKM09                               1.63 (-2.42)     1.73 (-3.50)     1.93 (-4.15)
JR09                                 1.57 (+0.87)     2.05 (-0.55)     2.81 (-1.44)

Table 7. Values of $\chi^2/N_{\rm pts.}$ for the CDF Run II inclusive jet data using the cone-based Midpoint
jet algorithm [34] with $N_{\rm pts.} = 72$ and $N_{\rm corr.} = 25$, for different PDF sets and different scale
choices $\mu_R = \mu_F = \mu = \{p_T/2, p_T, 2p_T\}$. The $\chi^2$ values are calculated accounting for all 25 sources
of correlated systematic uncertainty, using eq. (3.1), including the 5.8% normalisation uncertainty
due to the luminosity determination. No restriction is imposed on the shift in normalisation and the
optimal value of "$-r_{\rm lumi.}$" is shown in brackets, where the data points are shifted as $D_i \to D_i(1 -
0.058\, r_{\rm lumi.})$; see eq. (3.2). Values of $|r_{\rm lumi.}| \in [1, 3]$ are shown in italics and values $|r_{\rm lumi.}| > 3$ are
shown in bold. If the theory prediction was identically zero, then $\chi^2/N_{\rm pts.} = 2.44$ with $r_{\rm lumi.} = 10.2$.